Preface



This volume of the LNCS is the formal proceedings of the 2nd European Symposium 
on Ambient Intelligence, EUSAI 2004. This event was held on November 8-10, 2004 
at the Eindhoven University of Technology, in Eindhoven, the Netherlands. 

EUSAI 2004 followed a successful first event in 2003, organized by Philips 
Research. This turned out to be a timely initiative that created a forum for bringing 
together European researchers, working on different disciplines all contributing 
towards the human-centric technological vision of ambient intelligence. Compared to 
conferences working on similar and overlapping fields, the first EUSAI was 
characterized by a strong industrial focus reflected in the program committee and the 
content of the program. As program chairs of EUSAI 2004 we tried to preserve the 
character of this event and its combined focus on the four major thematic areas: 
ubiquitous computing, context awareness, intelligence, and natural interaction. 
Further, we tried to make EUSAI 2004 grow into a full-fledged double-track 
conference, with surrounding events like tutorials and specialized workshops, a poster 
and demonstration exhibition and a student design competition. The conference 
program included three invited keynotes: Ted Selker from MIT, Tom Rodden from 
the University of Nottingham and Tom Erickson from IBM. 

Out of 90 paper submissions received for the conference, 36 were selected for 
inclusion in this volume. Papers were submitted anonymously and 3-5 anonymous 
reviewers reviewed each. The review committee included experts from each of the 
four thematic areas mentioned above representing both academia and industry. The 
four program co-chairs made the final selection of papers for the proceedings. In this 
process, special attention was devoted to divergent reviews that arose from the 
multidisciplinary nature of this emerging field. We are very confident of the rigor and 
high standard of this review process that safeguarded the quality of the final 
proceedings and ensured fairness to the contributing authors. 

The papers in this volume are clustered into four groups: 

- ubiquitous computing: software architectures, communication and distribution,
- context sensing and machine perception,
- algorithms and ontologies for learning and adaptation,
- human computer interaction in ambient intelligence.

We hope the result of this collective effort shall be rewarding for readers. We wish 
to thank all authors who submitted their articles to EUSAI and especially the authors 
of the selected papers for their efforts in improving their papers according to the 
reviews they received in preparation of this volume. We thank the members of the 
review committee for their hard work and expert input and especially for responding 
so well when their workload exceeded our original expectations. 

We gratefully acknowledge the support of the JFS Schouten School for Research 
in User System Interaction, the Department of Industrial Design at TU/e, IOP-MMI 
Senter, the Royal Dutch Academy of Arts and Sciences (KNAW), Philips and Océ. 

We wish to thank all those who supported the organization of EUSAI 2004 and 
who worked hard to make it a success. Specifically, we thank Harm van Essen, Elise 
van de Hoven, Evelien Perik, Natalia Romero, Andres Lucero and Franka van 
Neerven for their work and commitment in organizing, publicizing and running this 
event. We thank also the special category co-chairs Wijnand IJsselsteijn, Gerd 
Kortuem, Ian McClelland, Kristof van Laerhoven and Boris de Ruyter. We note here 
that an adjunct proceedings including extended abstracts for posters, tutorials, 
demonstrations and workshops was published separately. 

Closing this preface, we wish to express our hope that this volume provides a 
useful reference for researchers in the field and that our efforts to make EUSAI 2004 
possible contributed to the building of a community of researchers from industry and 
academia that will pursue research in the field of ambient intelligence. 

Eindhoven, August 2004

Panos Markopoulos
Berry Eggen
Emile Aarts
James Crowley
Super-distributed RFID Tag Infrastructures



Jürgen Bohn and Friedemann Mattern



Institute for Pervasive Computing 
ETH Zurich, Switzerland 
{bohn,mattern}@inf.ethz.ch



Abstract. With the emerging mass production of very small, cheap Radio Fre- 
quency Identification (RFID) tags, it is becoming feasible to deploy such tags 
on a large scale. In this paper, we advocate distribution schemes where passive 
RFID tags are deployed in vast quantities and in a highly redundant fashion over 
large areas or object surfaces. We show that such an approach opens up a whole 
spectrum of possibilities for creating novel RFID-based services and applications, 
including a new means of cooperation between mobile physical entities. We also 
discuss a number of challenges related to this approach, such as the density and 
structure of tag distributions, and tag typing and clustering. Finally, we outline two 
prototypical applications (a smart autonomous vacuum cleaner and a collaborative 
map-making system) and indicate future directions of research. 



1 Introduction 

In industry, the potential of radio frequency identification (RFID) technology was first 
recognized in the 1990s, stimulating the desire for RFID-supported applications such as 
product tracking, supply chain optimization, asset and tool management, and inventory 
and counterfeit control [22]. Besides these “conventional” application areas, passive 
RFID tags are also suited to augmenting physical objects with virtual representations 
or computational functionality, providing a versatile technology for “bridging physical 
and virtual worlds” in ubiquitous computing environments, as Want et al. showed [23].

Currently, the proliferation of RFID technology is advancing rapidly, while RFID 
reader devices, antennas and tags are becoming increasingly smaller and cheaper. As a 
result, the deployment of RFID technology on a larger scale is about to become both 
technically and economically feasible. Hitachi, for instance, is about to commence mass 
production of the mu-chip [11], which is a miniature RFID tag with a surface area of 
0.3 mm². Further, the Auto-ID Center has proposed methods which could lower the cost 
per RFID chip to approx. five US cents [19]. 

In the conventional process of RFID tag deployment prevailing today, only a limited 
number of passive tags are placed in the environment in a deliberate and sparse fashion. 
Typically, RFID tags are mainly used for identifying objects [6,24] and for detecting 
the containedness relationships of these objects [14]. Explicitly placed stationary tags 
embedded in the environment also serve as dedicated artificial landmarks. They can be 
detected by means of a mobile RFID reader and are used to support the navigation of 
mobile devices and robots [13,16,17], or to mark places and passageways [10]. 

In this paper, we present the concept of super-distributed RFID tag infrastructures, 
which differs from the conventional means of RFID tag deployment and utilization. We 
advocate massively-redundant tag distributions where cheap passive RFID tags (i.e. tags 
without a built-in power supply) are deployed in large quantities and in a highly redundant 
fashion over large areas or object surfaces. We show that, in so doing, the identity of a 
single tag becomes insignificant in exchange for an increased efficiency, coverage, and 
robustness of the infrastructure thus created as a whole. We further demonstrate that such 
an approach opens up a whole spectrum of possibilities for creating novel RFID-based 
services and location-dependent applications, including a new means of cooperation 
between mobile entities. We also discuss some of the technological opportunities and 
challenges, with the intention of stimulating further research in this area. 

The remainder of the paper is organized as follows: In Section 2 we introduce the 
concept of super-distributed RFID-tag infrastructures and describe its particular quali- 
ties in detail. In Section 3, we discuss different means of deploying RFID tags efficiently 
and redundantly on a large scale. Then, in Section 4, we outline two prototypical appli- 
cations (a smart autonomous vacuum cleaner and a collaborative map-making system) 
and indicate future directions of research. 



2 Super-distributed RFID Tag Infrastructures 

Passive RFID tags typically incorporate a miniature processing unit and a circuit for 
receiving power if the tag is brought within the field of an RFID reader. The tags are usu- 
ally attached to mobile objects such as supermarket goods or other consumer products, 
and they send their identity to the reader over distances ranging from a few centimeters 
up to a few meters, depending on the type of tag. 

RFID tags that are spread across a particular space in large redundant quantities can 
in turn be regarded as a “super-distributed” collection of tiny, immobile smart objects. 
The term “super-distribution” refers to the fact that a vast number of tags are involved, 
similar to the notion of “super-distributed objects” in [20]. Accordingly, we will refer to 
such a highly redundant tag distribution as a super-distributed RFID tag infrastructure 
(SDRI). 

A highly redundant and dense distribution of tiny objects is also a common charac- 
teristic of wireless sensor networks which consist of a large number of very compact, 
autonomous sensor nodes. However, the two concepts differ fundamentally: in contrast 
to a fixed structure of independent and passive tags as part of an SDRI, wireless sen- 
sor networks are based on the “collaborative effort of a large number of nodes” [3]. 
Further, the topology of wireless sensor networks may change due to mobility on the 
part of its nodes. In addition, wireless sensor nodes carry their own power supply used 
to enable active sensing, data processing, and communication with other sensor nodes, 
whereas passive RFID tags only have very limited functionality, generally restricted to 
reading and writing a small amount of data. Also, compared to typical wireless sensor 
networks with nodes communicating over distances of tens of meters or more, mobile 
RFID antennas generally operate at a much shorter range. 

By deploying an SDRI in an area, the overall physical space is divided into tagged 
and thus uniquely identifiable physical locations. This means that each tag can be used 
as an identifier for the precise location it covers, where coverage is pragmatically defined 
as the reading range of the tag. What we thus obtain can be described as an approximate 
“discrete partitioning” of the physical space, of which different implementations are 
possible. If the tags of an SDRI are distributed according to a regular grid pattern, for 
example, then the partitioning of the physical space itself can be considered a physical 
grid of uniquely addressable cells approximating to the concept of a regular occupancy 
grid, as applied in the field of mobile robot navigation [7], for instance. If, on the other 
hand, the tags of the SDRI are distributed in a random fashion, we obtain an irregular 
pattern of uniquely addressable cells. Ideally, these cells are non-overlapping and cover 
the whole area. In practice, one can only approximate these properties. 

In addition to the massive potential redundancy of RFID tags, two particularly in- 
teresting qualities of SDRIs are that they enable mobile devices to interact with their 
local physical environment, and that such an interaction can be performed in a highly 
distributed and concurrent manner. In the following, we explain these qualities in more 
detail. 



2.1 Local Interaction with Physical Places 

An SDRI allows mobile objects to store and retrieve data in the precise geographic 
location in which they are situated by writing to or reading from nearby RFID tags. 
Independent, anonymous entities are thus in a position to share knowledge and context 
information in situ. 

One potential application of this quality is self-describing and self-announcing lo- 
cations, where mobile devices can gain contextual or topological information on the 
spot simply by querying the local part of the RFID tag infrastructure. For instance, mo- 
bile GPS-enhanced vehicles could locally store positioning information while moving 
within an SDRI. Once a sufficiently large proportion of an affected area has thus been 
initialized, other mobile devices can be helped to recalibrate their GPS receivers, and 
GPS-less devices can be enabled to determine their position without using a dedicated 
positioning system themselves. Positional information stored in the SDRI can also be 
used to establish a fail-back service in case the primary positioning service is temporarily 
unavailable, thus increasing the overall availability of positional information in the area. 
Further, an SDRI facilitates the definition of arbitrary regions within physical spaces: 
virtual zones, barriers and markers can easily be defined by marking particular tags (or 
the tags along a border line) in the SDRI accordingly. 

Furthermore, SDRIs in general offer physical anchor points which can serve as entry 
points into virtual spaces by allowing mobile devices to leave data traces, messages or 
links to virtual information (residing in a background infrastructure) wherever they roam. 
It is therefore possible to use the RFID tags of an SDRI as an alternative medium for 
implementing physical hyperlinks [12,18] in virtual spaces, or as a means of attaching 
virtual annotations [21] to physical places. 

By providing a means for roaming mobile objects to anchor and thus persistently 
store location-dependent sensor information on the spot, SDRIs also constitute a self- 
sufficient alternative to services such as GeoWiki [8], where virtual information is linked 
to a geographical address, but which require explicit knowledge of the current geographic 
location or the continuous availability of a location service of a sufficiently high resolu- 
tion and accuracy. 
Fig. 1. A mobile object has left a virtual trace on an RFID tagged floor space. Here, each tag that 
is part of the trace not only contains information about the identity of the tracked physical entity, 
but also indicates changes in its direction. 



Mobile objects may also leave virtual traces in the physical space they traverse. By 
writing an anonymous ID to writeable tags¹ in the floor space, a vehicle, for example, can 
leave a trace which can subsequently be retraced and followed by other mobile objects 
(see Fig. 1). These traces could later be overwritten by other vehicles or persons, so that 
they would fade away over time. By exploiting tag redundancy and enforcing suitable 
tag writing strategies, it should be possible to prevent the immediate deletion of a newly 
laid trace by following objects. For example, a moving vehicle could randomly choose 
one tag out of k available tags per location. Furthermore, a single tag could store the IDs 
of several different traces. 
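
The tag writing strategy outlined above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation used by the authors: the `Tag` class, the `write_trace` and `traces_at` helpers, and the per-tag capacity `max_traces` are hypothetical names introduced here.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Tag:
    """A writable tag holding a bounded list of trace IDs (hypothetical model)."""
    tag_id: int
    max_traces: int = 2          # assumed capacity; real tags store very little data
    traces: list = field(default_factory=list)

    def write(self, trace_id: str) -> None:
        # The oldest entry is dropped first, so traces fade as new ones arrive.
        if trace_id in self.traces:
            return
        if len(self.traces) >= self.max_traces:
            self.traces.pop(0)
        self.traces.append(trace_id)


def write_trace(tags_in_range: list, trace_id: str, rng: random.Random) -> Tag:
    """Write the trace ID to one randomly chosen tag out of the k tags in range."""
    chosen = rng.choice(tags_in_range)
    chosen.write(trace_id)
    return chosen


def traces_at(tags_in_range: list) -> set:
    """All trace IDs readable at a location (union over the redundant tags)."""
    return {t for tag in tags_in_range for t in tag.traces}


# One location covered by k = 4 redundant tags, visited by two vehicles.
rng = random.Random(0)
location = [Tag(tag_id=i) for i in range(4)]
write_trace(location, "vehicle-A", rng)
write_trace(location, "vehicle-B", rng)
print(traces_at(location))
```

Because each vehicle writes to only one of the k tags and each tag can hold more than one ID, the second vehicle does not erase the trace laid by the first in this example.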

2.2 Global Collaboration Between Mobile Objects 

An SDRI can be regarded as a scalable shared medium with (almost) unlimited, inde- 
pendent, and highly distributed physical “access points”. In this respect, an SDRI is 
particularly conducive to supporting global collaboration, where several independent 
physical entities work together on a single task in a highly decentralized and concurrent 
fashion. This is possible since SDRIs enable mobile devices to store and access data 
locally at the precise location they occupy at a given moment, so that devices which are 
situated at different locations within an SDRI can read from or write to tags simultane- 
ously and independently. 

Consequently, SDRIs are well suited for the implementation of a number of collabo- 
rative applications. Mobile objects that gather location-dependent information can store 
that information directly at the respective location by means of the SDRI, resulting in 
teamwork between anonymous entities that can be exploited for initializing or “boot- 
strapping” a particular SDRI with positional or topological information, for instance. 
Further, mobile objects can, depending on their capabilities, participate on the fly to 
achieve such a common goal while actually pursuing a different primary objective. 

One concrete example of this is the collaborative exploration of an area. Since observations 
are based on globally unique tag IDs, the different map-making observations 
of independent mobile entities can be unambiguously combined to form a global map 
(see also the prototype description below in Section 4). Such an SDRI-aided collaborative 
map-making process scales well as it can be performed by an arbitrary number of 
concurrent entities in a highly decentralized fashion. 

¹ Tags are physically writable if an RFID scanner can be used to write data directly onto the tags. 
However, it is also possible to virtually write data onto read-only tags by means of a suitable 
background infrastructure, as described in Section 2.4. 
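
How observations keyed by globally unique tag IDs might be combined can be illustrated with a short sketch. It assumes that every vehicle reports tag positions in a common coordinate frame and that conflicting estimates are simply averaged; both the averaging rule and the function name `merge_observations` are assumptions made for illustration only.

```python
from collections import defaultdict


def merge_observations(*vehicle_maps: dict) -> dict:
    """Combine per-vehicle maps {tag_id: (x, y)} into one global map.

    Tags seen by several vehicles receive the mean of the reported positions.
    """
    collected = defaultdict(list)
    for vehicle_map in vehicle_maps:
        for tag_id, position in vehicle_map.items():
            collected[tag_id].append(position)

    global_map = {}
    for tag_id, positions in collected.items():
        n = len(positions)
        global_map[tag_id] = (
            sum(p[0] for p in positions) / n,
            sum(p[1] for p in positions) / n,
        )
    return global_map


# Two vehicles explored overlapping parts of the area (made-up tag IDs).
map_a = {"tag-17": (1.0, 2.0), "tag-42": (3.0, 0.0)}
map_b = {"tag-42": (3.5, 0.5), "tag-99": (6.0, 1.0)}
global_map = merge_observations(map_a, map_b)
```

The unique tag ID serves as the join key, so no vehicle needs to know which other vehicles contributed to the map.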

Of course, for some of the scenarios described to become possible, both technical 
challenges (e.g., reading from or writing to RFID tags at high velocities) and conceptual 
issues (such as the question of commonly understandable topological models and on- 
tologies, or the problem of updating information previously stored in the SDRI in order 
to reflect changes in the environment) have to be addressed. 

2.3 Scope of Deployment 

So far we have discussed a number of scenarios where SDRIs are deployed over large 
floor spaces in order to provide a novel means of local interaction and global collabo- 
ration. However, the scope of SDRI deployment is not just limited to such large-scale 
scenarios. There are also various situations where smaller-scale SDRIs have their distinct 
benefits. 

For a table equipped with RFID tags embedded in its surface, for example, the 
intrinsic qualities of SDRIs still apply: such a “small-scale” SDRI also yields uniquely 
addressable cells, allowing multiple devices (or users) to interact with different sections 
of the tabletop simultaneously. If the tag distribution of the tabletop is known or if each 
tag knows its position with respect to the local coordinate system of the table, a smart 
object on the table can also easily determine its position with regard to the tabletop, or 
derive the relative distance from other smart objects on top of the table by communicating 
and exchanging particular tag IDs or tag coordinates. Similarly, a wall whose surface 
is coated with a layer of RFID tags can be turned into a smart “notice-board” featuring 
support for the positioning of objects that are attached to it. 

2.4 Physical Versus Virtual Tags 

If RFID tags support read-write operations, then they obviously enable mobile devices to 
store a certain amount of data directly on the physical tags themselves. As a consequence, 
a mobile device can read from and write to the physical matter at its respective location, 
literally speaking. 

Accordingly, if the available physical RFID tags are of a read-only type, a mobile de- 
vice cannot directly write data to the tags. However, in this case we can still use the unique 
ID of the tags to unambiguously map each physical tag of the SDRI to a corresponding 
virtual tag residing in the background infrastructure. Rather than writing to a physical 
tag within the direct range of the mobile device, the device instead wirelessly connects to 
the virtual representation of the tag. The virtual tag may either simply provide the basic 
data read/write operations of a physical read-write tag, or even augment its capabilities 
by offering additional services which cannot be implemented on the small physical tags 
themselves due to resource limitations. So one distinctive advantage of virtual tags over 
mere physical tags is that they don’t suffer from physical resource limitations, enabling 
us to write an almost unlimited quantity of data to a virtual representation of the physical 
tag. 
However, to ensure instantaneous read/write access to virtual tags, continuous wireless 
access to a suitable background infrastructure that manages the virtually written 
tag data is needed. In this case, the SDRI is no longer self-sufficient in so far as it needs 
to be continuously connected to the background infrastructure. 
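
The mapping from physical read-only tags to virtual tags can be sketched as a simple key-value store indexed by tag ID. The class `VirtualTagStore` below is a hypothetical in-memory stand-in for the background infrastructure; a real deployment would sit behind a wireless network service, which is not modeled here.

```python
class VirtualTagStore:
    """In-memory stand-in for the background infrastructure (hypothetical)."""

    def __init__(self):
        self._data = {}

    def write(self, tag_id: str, key: str, value) -> None:
        """Store a value on the virtual tag that belongs to the physical tag ID."""
        self._data.setdefault(tag_id, {})[key] = value

    def read(self, tag_id: str, key: str, default=None):
        """Read a value from the virtual tag, or return default if absent."""
        return self._data.get(tag_id, {}).get(key, default)


store = VirtualTagStore()
# A device scans a read-only tag, obtains its unique ID, and writes "to the tag".
scanned_id = "E0040100-0001"   # made-up ID for illustration
store.write(scanned_id, "note", "wet floor")
store.write(scanned_id, "position", (12.5, 3.0))
```

Since the store is not bound by the memory of the physical tag, the amount of data per tag is limited only by the background infrastructure.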



3 Efficient and Redundant Large-Scale Deployment of RFID Tags 

The deployment of large quantities of tiny RFID tags over large areas necessitates an 
efficient means of tag distribution. Instead of minutely distributing tags across an area 
according to rigid and well-defined patterns, which typically goes hand in hand with 
a time-consuming calibration of these tags, we think that a highly redundant random 
distribution of tags is often more favorable. Such a random distribution can be achieved 
in various ways. For instance, RFID tags could be randomly mixed with various building 
materials such as paint, floor screed, concrete, etc., which is similar to the idea of mixing 
computer particles with “bulk materials” as described by Abelson et al. in [1]. Of course, 
in some cases such a procedure requires quite durable and resilient tags. But even if a 
certain percentage of tags were to be rendered defective in the process, the number of 
operable tags could be controlled by applying the necessary degree of redundancy. 

If tags are uniformly distributed in a random manner over large areas, we can make 
assumptions about the average tag density and the coverage of the area. In some cases 
we can even randomly distribute tags and still maintain a certain regular distribution 
structure. By integrating RFID tags at regular intervals with string or the thread used 
for weaving carpets, for instance, it would be feasible to weave a complete carpet or 
produce carpet tiles that exhibited a regular RFID tag texture (e.g., forming a mesh of 
RFID tags). Even though we would not know the absolute positions of the tags after 
an RFID-augmented carpet had been laid out in a random fashion, we would still have 
relatively precise information about the distances between neighboring tags and about 
the overall tag density. 

Although a random large-scale tag distribution enables a dense and, on average, 
uniformly distributed coverage comparatively cheaply and easily, it does pose some 
challenges. For instance, if the cost per RFID tag is too high even in mass production, a 
dense large-scale deployment may not be economically feasible. Further issues are the 
durability of RFID tags that are embedded in a carrier material, and the “bootstrapping” 
of super-distributed RFID infrastructures with respect to positioning and the provisioning 
of location-dependent context information. 



3.1 Deployment of RFID Readers Versus Tags 

Rather than tagging large areas with small, cheap RFID tags, it is also possible to 
distribute RFID reader antennas instead. By integrating an array of stationary RFID 
antennas into the floor, as described in [2], it is possible to detect tags that pass over 
particular readers, or even track certain tags if the output of several readers is combined 
and analyzed. Thus, by simply attaching a passive RFID tag to a mobile device, the latter 
is freed from the extra burden imposed by an energy-consuming RFID scanner. 
However, such an approach has several disadvantages compared to the concept of 
SDRIs. First, integrating readers or antennas into the floor or into surfaces is resource- 
intensive, as quite a large quantity of expensive RFID equipment is needed to achieve 
a good resolution and coverage. For the continuous operation of these RFID readers 
and antennas, additional energized electronic devices are required. Furthermore, the 
deployment of the RFID equipment is complex and may require a considerable amount 
of construction work, not to mention the costly maintenance in the event that a device 
has to be replaced at a later point in time. Also, if the antennas are embedded into the 
environment and the mobile entities are tagged with passive RFID tags, the latter have 
only a limited means of controlling their degree of “visibility”. The mobile objects cannot 
easily prevent themselves from being detected or even tracked by the environment. If, on 
the other hand, the environment is being tagged and remains passive, the mobile entity 
itself is performing the sensing. Consequently, the mobile device remains in control of 
any interaction taking place with the environment, thus facilitating the implementation 
of a specific privacy policy, for example. 



3.2 RFID Tag Distribution Patterns 

In order to accomplish a large-scale distribution of RFID tags, there are a variety of 
possible tag distribution patterns to choose from. Typical distribution patterns include: 

- Random uniform distribution: Tags are uniformly distributed over a certain area in 
a random manner. 

- Regular distribution: Tags are distributed in a regular pattern, but usually with ran- 
dom tag identification numbers. Typical regular patterns are: 

• Grid pattern: Given a grid with edge length d, each non-border tag has four 
nearest adjacent neighbors at a distance d and four farther adjacent diagonal 
neighbors at a distance sqrt(2) * d. 

• Equilateral triangulation pattern: Each tag has six equidistant neighbors at a 
distance d. 
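
The neighbor relations of the two regular patterns can be checked with a small sketch. The helper names `grid_points`, `triangular_points` and `neighbors_at` are introduced here for illustration; the code only generates a small patch of each lattice and counts neighbors of an interior tag.

```python
import math


def grid_points(n: int, d: float) -> list:
    """n x n square grid with edge length d."""
    return [(i * d, j * d) for i in range(n) for j in range(n)]


def triangular_points(n: int, d: float) -> list:
    """n x n patch of an equilateral triangular lattice with spacing d."""
    row_height = d * math.sqrt(3) / 2
    # Every second row is shifted by half a spacing.
    return [((i + 0.5 * (j % 2)) * d, j * row_height)
            for i in range(n) for j in range(n)]


def neighbors_at(points: list, center: tuple, dist: float, tol: float = 1e-9) -> int:
    """Count points whose distance from center equals dist (within tol)."""
    return sum(1 for p in points
               if p != center and abs(math.dist(p, center) - dist) < tol)


d = 1.0
grid = grid_points(5, d)
tri = triangular_points(5, d)
grid_center = (2 * d, 2 * d)                      # interior tag of the grid
tri_center = (2 * d, 2 * d * math.sqrt(3) / 2)    # interior tag, unshifted row

print(neighbors_at(grid, grid_center, d))
print(neighbors_at(grid, grid_center, math.sqrt(2) * d))
print(neighbors_at(tri, tri_center, d))
```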

From the perspective of a mobile object, random tag distribution patterns have in- 
herently different properties compared with regular patterns. One example is illustrated 
in Fig. 2. We expect there to be other generally advantageous patterns for the distribu- 
tion of RFID tags, such as irregular but non-random distribution patterns, for example. 
This calls for the investigation of suitable tag distribution patterns and their respective 
properties as part of future research. 
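
The difference between random and regular patterns on a straight path can be made concrete with a rough numerical sketch. It rests on assumptions that are not part of the paper: random tags are modeled as a homogeneous Poisson field, and the reader has an idealized circular range, so that the region swept along a path of length L is a stadium of area 2rL + πr². All function and parameter names are hypothetical.

```python
import math


def p_encounter_random(density: float, reader_range: float, path_length: float) -> float:
    """Probability of detecting at least one tag on a straight path.

    Assumes a homogeneous Poisson field of tags and a circular reader range.
    """
    swept_area = 2 * reader_range * path_length + math.pi * reader_range ** 2
    return 1.0 - math.exp(-density * swept_area)


def grid_tags_on_path(d: float, reader_range: float, y_offset: float,
                      path_length: float) -> int:
    """Count grid tags detected on a path along the x-axis at height y_offset.

    Only the rows next to the path are checked, so this assumes
    0 <= y_offset <= d and reader_range < d.
    """
    count = 0
    columns = int(path_length / d) + 2
    for i in range(-1, columns):
        for j in range(-1, 3):
            x, y = i * d, j * d
            nearest_x = min(max(x, 0.0), path_length)   # closest point on the segment
            if math.hypot(x - nearest_x, y - y_offset) <= reader_range:
                count += 1
    return count


# Same average density (1 tag per square metre), 10 m path, 0.3 m reader range.
p_random = p_encounter_random(density=1.0, reader_range=0.3, path_length=10.0)
on_a_row = grid_tags_on_path(d=1.0, reader_range=0.3, y_offset=0.0, path_length=10.0)
between_rows = grid_tags_on_path(d=1.0, reader_range=0.3, y_offset=0.5, path_length=10.0)
```

In this setting a path running exactly between two grid rows detects no tag at all, whereas the random field yields a detection probability that approaches 1 as the path grows longer.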



3.3 Sparse Versus Dense Tag Distributions 

The density of the RFID tag distribution which can be achieved in an SDRI primarily 
depends on the properties of the underlying RFID technology. A secondary aspect is the 
required degree of resolution and the degree of tag redundancy, which can be expressed 
by the average number of tags that are within the range of the mobile reader antenna at 
an arbitrary location, for example. 
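
The redundancy measure mentioned above, the average number of tags within reader range, follows directly from the tag density if the range is idealized as a circle of radius r: the mean is density × πr². The sketch below states this relation and its inverse; the function names and the example values are assumptions chosen for illustration.

```python
import math


def mean_tags_in_range(density: float, reader_range: float) -> float:
    """Average number of tags inside an idealized circular reader range."""
    return density * math.pi * reader_range ** 2


def density_for_redundancy(target_tags: float, reader_range: float) -> float:
    """Tag density (tags per unit area) needed for a target average redundancy."""
    return target_tags / (math.pi * reader_range ** 2)


# Example: a reader range of 5 cm and a target of 3 tags per scan on average.
required_density = density_for_redundancy(target_tags=3, reader_range=0.05)
print(f"{required_density:.0f} tags per square metre")
```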
Fig. 2. In an area of randomly and uniformly distributed tags, a mobile reader will come across 
another tag while moving on a straight path with probability p > 0 depending on the tag density 
and the distance traveled. In regular RFID tag distributions, such as a grid structure, it is possible 
that a mobile reader may choose a straight path where it will never encounter a single tag. 



Sparse Non-Overlay Tag Distribution. If the RFID technology used does not support 
collision detection and resolution, only a single tag should be within the range of the 
reader antenna at any given location. We call this a sparse tag distribution. In this case, 
the maximum tolerable tag density of the SDRI is limited by the characteristics of the 
available reader antenna. The distributed tags should be spaced in such a way that each 
tag exclusively covers an area (typically larger than the scan range covered by the reader 
antenna). Otherwise, the tags cannot be reliably scanned due to frequently occurring 
collisions. But even if collision resolution is available, one might deliberately prefer a 
non-redundant RFID tag distribution in order to simplify or speed up tag processing. 







Fig. 3. Examples of sparse non-overlay tag distributions: a) random b) grid c) triangular. The 
circles enclosing the tags indicate the idealized area in which a specific tag can be detected. 



A sparse, non-overlapping tag distribution has several disadvantages, however. First, 
it results in an inflexible partitioning of the area covered into coarse-grained cells whose 
dimensions are defined by the range of the mobile reader antenna. Secondly, the typical 
scan range of reader antennas is roughly circular or elliptical. As a consequence, a non- 
overlapping tag distribution would not cover the entire area, but would result in “blind 
spots” at the fringes where no tags could be detected at all (see Fig. 3). Thirdly, a non- 
overlapping tag distribution is not redundant, which means that tag failures cannot be 
compensated for. Even if blind spots at the fringes of non-overlapping detection ranges 
can be regarded as negligible transition zones and are therefore acceptable, the failure 
of single tags results in larger areas where no tag can be detected at all. 

Dense Overlay Tag Distribution. In order to establish an SDRI with overlapping tag 
scan ranges, so that multiple tags can be detected per scan and per location, the RFID 
system must support collision resolution. In this case, the tolerable tag density is lim- 
ited firstly by the technically viable proximity of tags at which these tags still respond 
correctly during a scan (and are unaffected by tag detuning [9], for instance), and sec- 
ondly by the maximum number of tags within antenna range that can be simultaneously 
detected by the anti-collision scheme in a reasonable time. Both factors depend on the 
particular characteristics of the RFID system. 




Fig. 4. Examples of dense overlay tag distributions: a) random b) grid c) triangular. 



Dense overlay tag distributions allow us to achieve a fine-grained and complete tag 
coverage without “blind spots” (see Fig. 4). Another important advantage is the intro- 
duction of redundancy with respect to the number of RFID tags that are detectable per 
scan at a given physical location. By using an anti-collision system and a sufficiently 
high scan range, we can detect several tags per location. A location could then be de- 
scribed by the set of all the tags that are detectable at the respective place. If we detect 
several tags, we can calculate the center point of their locations and use that value as a 
position approximation for the current location by using localization techniques as pro- 
posed in [5], for instance. This has the benefit of increased robustness with regard to tag 
failures: if a tag is destroyed or fails over time, we still detect the remaining functional 
tags at a location. 
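
The center-point approximation can be sketched as a plain centroid over the known positions of all detected tags. This is a simplified reading of the idea; the localization techniques in [5] may differ in detail, and the name `estimate_position` as well as the sample coordinates are assumptions.

```python
def estimate_position(detected_ids, known_positions):
    """Approximate the reader position as the centroid of the detected tags.

    Tags without a known position are ignored; returns None if none remain.
    """
    points = [known_positions[t] for t in detected_ids if t in known_positions]
    if not points:
        return None
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


known_positions = {"t1": (0.0, 0.0), "t2": (2.0, 0.0), "t3": (1.0, 3.0)}

full = estimate_position(["t1", "t2", "t3"], known_positions)
degraded = estimate_position(["t1", "t2"], known_positions)   # t3 has failed
```

When one tag fails, the estimate shifts but remains available, which is the robustness benefit described above.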

3.4 Tag Typing and Clustering 

Even though each RFID tag in an SDRI has its own unique ID, in certain situations it 
may be sufficient to discern only particular categories or types of tags. For the large-scale 
deployment of RFID tags within a building, for example, one might want to use different 
types of tags in different sections of a building, such as a tag type A for corridors, a type 
B for public spaces, a type C for private areas, and another type D to mark stairways 
and elevators. So in addition to a unique tag ID, each tag could also be equipped with an 
additional data field containing a predefined type identifier. If the RFID tags supported 
physical (or virtual) write access, it would be possible to “impregnate” the desired type 
identifiers onto tags after they had been deployed. Alternatively, tags of a particular type 
could be grouped and pre-packed together for efficient deployment. 
There are many different fields of application for tag typing and clustering, such 
as marking potentially hazardous areas for visually impaired people who are equipped 
with a smart cane 2 (e.g., in order to warn the person about an approaching stairway), or 
defining different categories of area for mobile robots. 
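As a rough illustration of tag typing, the sketch below pairs each unique tag ID with one of the four type identifiers A to D mentioned above and lets a smart cane react only to the type. The class and function names (`TagType`, `Tag`, `cane_feedback`) are hypothetical and chosen for this example only.

```python
from dataclasses import dataclass
from enum import Enum


class TagType(Enum):
    """Type identifiers following the A-D example in the text."""
    CORRIDOR = "A"
    PUBLIC_SPACE = "B"
    PRIVATE_AREA = "C"
    STAIRS_OR_ELEVATOR = "D"


@dataclass(frozen=True)
class Tag:
    tag_id: str        # unique ID of the RFID tag
    tag_type: TagType  # additional data field holding the type identifier


def cane_feedback(tag: Tag) -> str:
    """Hypothetical smart-cane policy: warn only near stairways and elevators."""
    if tag.tag_type is TagType.STAIRS_OR_ELEVATOR:
        return "warn"
    return "none"
```

The application never needs to look up the individual tag ID here; the type field alone is sufficient, which is the point of tag typing.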

4 Current Prototypes and Future Work 

So far we have implemented prototypes of two SDRI-based applications (using the 
Hitachi mu-chip [11] and LEGO Mindstorms [15]), which demonstrate the feasibility 
and versatility of our approach. The first prototype is a location-aware autonomous 
vacuum cleaner (equipped with a mobile RFID reader and antenna) which adjusts its 
behavior based on tags embedded in the floor space at its particular location, such as 
avoiding areas that are marked as off-limits or keeping within an area surrounded by 
a virtual barrier (consisting of tags that have been marked accordingly in the teaching 
mode of the mobile robot).

Fig. 5. Mobile model vehicle built using LEGO Mindstorms technology. The vehicle motor and
sensors are operated by the LEGO RCX unit, which receives control commands from or sends
status information back to the LEGO infrared transmitter which is connected to a laptop computer.
The RFID reader is also connected to the laptop on which the software for reading tags and
calculating the tag positions is executed. The bottom view of the vehicle shows the RFID antenna,
which is mounted approx. two centimeters above the floor, and the two rotation sensors measuring
the revolutions of the back wheels. At the front of the vehicle, a bumper is connected to a pressure
sensor to detect collisions while the vehicle is in motion.



The second is a system for collaborative map making where independent mobile
vehicles (each again equipped with a mobile RFID reader and antenna) explore a
previously unknown area (see Fig. 5). Starting from a known position, each vehicle chooses
a random path through the area and thereby keeps track of the tags encountered and
the relative inter-tag distances. The separate tag observations are subsequently merged
by means of an efficient least-squares coordinate transformation algorithm, yielding a
global map containing the absolute positions of known RFID tags. This map can then
be used by other mobile devices for navigation and self-positioning purposes. Based on
this prototype, we plan to implement an SDRI-based positioning system to be integrated
with a modular probabilistic positioning system [4] which we have already developed.
For the map-making trials, a sparse random tag distribution on a 0.5 m x 0.5 m floor
area has been used. The mobile reader antenna covers an area of about 9 cm x 6 cm at a
distance of about 1 cm above the floor. Using mu-chips equipped with a 4 cm long film
antenna (inlet), a density of about 120 tags/m² has proven to be sufficient for this type
of application.

2 Goto et al. have developed a smart cane with an integrated RFID reader, for instance. It responds
to RFID tags which serve as “data-carriers” embedded in the floor at places of interest [10].
Here an SDRI could provide a fine-grained and complete distribution of data-carriers over large
areas, as opposed to the cumbersome process of deploying such data-carriers selectively.
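The text does not specify which least-squares coordinate transformation is used to merge the observations. As one possible reading, the sketch below uses the standard closed-form least-squares fit of a 2D rotation and translation between tags that appear both in a vehicle's local map and in a reference map. The function names and the assumption of a rigid (scale-free) transformation are ours, not the authors'.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def fit_rigid_transform(local: List[Point],
                        ref: List[Point]) -> Tuple[float, Point]:
    """Least-squares rotation angle and translation mapping local -> ref.

    local[i] and ref[i] must be the positions of the same tag in both maps.
    """
    n = len(local)
    if n != len(ref) or n < 2:
        raise ValueError("need at least two matched tag positions")
    lcx = sum(p[0] for p in local) / n
    lcy = sum(p[1] for p in local) / n
    rcx = sum(p[0] for p in ref) / n
    rcy = sum(p[1] for p in ref) / n
    dot = cross = 0.0
    for (lx, ly), (rx, ry) in zip(local, ref):
        ax, ay = lx - lcx, ly - lcy   # centered local point
        bx, by = rx - rcx, ry - rcy   # centered reference point
        dot += ax * bx + ay * by
        cross += ax * by - ay * bx
    theta = math.atan2(cross, dot)    # angle minimizing the squared error
    c, s = math.cos(theta), math.sin(theta)
    tx = rcx - (c * lcx - s * lcy)
    ty = rcy - (s * lcx + c * lcy)
    return theta, (tx, ty)


def apply_transform(theta: float, t: Point, p: Point) -> Point:
    """Rotate p by theta and translate it by t."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * p[0] - s * p[1] + t[0], s * p[0] + c * p[1] + t[1])
```

Once the transformation has been fitted on the shared tags, `apply_transform` can place every other tag observed by that vehicle into the global map.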

Further experiments are underway to explore different tag distribution patterns and
tag densities in practice and to determine their influence on scalability, efficiency, and
robustness. The potential of autonomous vehicles with two or more reader antennas
will also be explored. Apart from building specific SDRI-based applications for
demonstration and more systematic evaluation purposes, we are also looking into ways of
developing general middleware which will provide an efficient and reliable means of
accessing an underlying physical SDRI, including fault-tolerant read/write operations,
an automated maintenance procedure for the “hot” integration of newly distributed tags
during operation, and high-level services such as location management, self-positioning,
and local data sharing. We are particularly interested in the issue of robustness and the
degree of fault-tolerance that can be achieved through massive redundancy of
“super-distributed” RFID tags.

Acknowledgements. We wish to acknowledge Svetlana Domnitcheva, Julio Perez, and 
Matthias Sala for implementing the location-aware autonomous smart vacuum cleaner. 
We would also like to thank Marco Bar for his work on the collaborative map-making

Abstract. One proposed way to realize the AmI vision is to turn everyday
objects into artifacts (by adding sensing, computation and communication
abilities) and then use them as components of Ubiquitous Computing
(UbiComp) applications within an AmI environment. The (re)configuration of
associations among these artifacts will enable people to set up their living
spaces in a way that will serve them best, while minimizing the
required human intervention. During the development and deployment of
UbiComp applications, a number of key issues arise, such as semantic
interoperability and service discovery. The aim of this paper is to show how
ontologies can be used in UbiComp systems to address such issues.
We support our approach by presenting the ontology that we developed and
integrated into a framework that supports the composition of UbiComp
applications.



1 Introduction 

One proposed way to realize the AmI vision is to turn everyday objects into artifacts
(by adding sensing, computation and communication abilities) and then use them as
components of UbiComp applications within AmI environments. The (re)configuration
of associations among these artifacts will enable people to set up their living
spaces in a way that will serve them best, while minimizing the required
human intervention. A limitation of the current technology is that it requires human
intervention in order to enable the communication and collaboration among the
artifacts.

Within the context of an architectural framework that supports the composition of
UbiComp systems, the heterogeneity of the devices that constitute the artifact
“ecologies” is an important parameter. The feasibility of semantic interoperability
among heterogeneous devices is therefore a key issue. Also, the dynamic nature of
UbiComp applications, which may lead to unanticipated situations, requires the existence
of a service discovery mechanism. Various types of middleware (based on CORBA,
Java RMI, SOAP, etc.) have been developed to enable the communication
between different UbiComp devices. However, these middleware systems have no facilities to
handle issues like the semantic interoperability among heterogeneous artifacts.

In order to handle such issues, the primary requirement is to provide the
heterogeneous devices with a common language that supports the communication and
collaboration among them. This common language must be based on the description
and definition of the basic concepts of the artifact “ecologies” and must be both
flexible and extensible so that new concepts can be added and represented. The aim
of this paper is to present the developed ontology that represents this common
language and show how this ontology can handle the key issues that arise during the
development and deployment of UbiComp applications.

The rest of the paper is organised as follows. Section 2 describes the basic concepts 
of the framework that supports the composition of UbiComp applications, the key 
issues that arise during this procedure and how an ontology can accommodate them. 
Section 3 presents the ontology that was designed and developed and section 4 
introduces the mechanism that was developed for the management of this ontology. 
Section 5 describes the use of this ontology in UbiComp applications through
examples based on a specific scenario. In section 6 related approaches for ontologies 
in ubiquitous computing environments are presented. The paper closes with the 
lessons learned from our experience, an evaluation of our approach and an outlook on 
future work in section 7. 



2 Key Issues in Ubiquitous Computing Systems 

The Gadgetware Architectural Style (GAS) [5] is an architectural framework that 
supports the composition of UbiComp applications from everyday physical objects 
enhanced with sensing, acting, processing and communication abilities. UbiComp 
applications are dynamic, distinguishable, functional configurations of associated 
artifacts, which communicate and/or collaborate in order to realize a collective 
behavior. Each artifact makes visible its properties, capabilities and services through 
specific interfaces (we’ll sometimes use the term “Plugs”); an association between 
two compatible interfaces is called a “Synapse”. 




Fig. 1. A study UbiComp application realized as a synapse between plugs (figure labels: eBook, eLamp, Study eGadgetWorld, Switch On/Off)

The basic concepts defined above are illustrated in Figure 1 through a simplified
scenario. Two artifacts (eBook and eLamp) are connected through a synapse which is
established between two plugs, forming a “study” UbiComp application. When the
user opens the eBook, the eLamp switches on, adjusting the light conditions to a
specified luminosity level in order to satisfy the user’s profile. Each artifact operates
independently of the other. Two plugs are used in this scenario: “open/close”,
reflecting the state of the book, and “switch on/off”, reflecting the state of the lamp.
Although not obvious, these plugs are compatible. Thus, an end-user can compose a
simple UbiComp application by establishing a synapse between these two plugs.
Different users may have access to this “application”, each one having defined their own
study profile. Even during the composition of such a simple UbiComp application, a
number of key issues arise that must be addressed. Below we present a set of such
issues and explain our decision to use ontologies in order to address them.



2.1 Semantic Interoperability Among Artifacts 

The composition of UbiComp applications is based on the interaction among devices. 
Since the heterogeneity of these devices is an aspect that cannot be neglected, the 
challenge that we have to handle is the feasibility of semantic interoperability among 
autonomous and heterogeneous devices. In our approach and in order to present the 
autonomous nature and function of each artifact, we chose to base this interaction on 
well-defined and commonly understood concepts, so that the artifacts can 
communicate with each other in a consistent and unambiguous way. So the artifacts 
have to use the same language and a common vocabulary. Note that this common 
language must be flexible and extensible so that new concepts can be added and 
represented. For the representation of this common language we decided to use an 
ontology that describes the semantics of the basic terms of UbiComp applications and 
defines their inter-relations. 



2.2 Dynamic Nature of UbiComp Applications 

One of the most important features of UbiComp applications is that they are created 
in a dynamic way. Users are permitted to create and delete synapses between artifacts 
whenever they want without restrictions. As synapses are associations between two
compatible plugs, the creation of a synapse requires some form of plug compatibility
check. The compatibility of two plugs is determined by several factors, e.g. the type of
input that they accept and the type of output that they produce, which must be
represented in a formal form.
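A compatibility check of this kind might be sketched as follows. The text names only input and output types as example factors, so the rule used here (one plug must produce a type that the other accepts) and the names `PlugDescription` and `compatible` are assumptions made for illustration, not the GAS definition.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class PlugDescription:
    """Hypothetical formal description of a plug."""
    name: str
    accepts: FrozenSet[str]   # types of input the plug accepts
    produces: FrozenSet[str]  # types of output the plug produces


def compatible(a: PlugDescription, b: PlugDescription) -> bool:
    """Assumed rule: at least one plug produces a type the other accepts."""
    return bool(a.produces & b.accepts) or bool(b.produces & a.accepts)
```

Under this assumed rule, an “open/close” plug producing a boolean state and a “switch on/off” plug accepting a boolean would be compatible, even though their names suggest nothing in common.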

The dynamic nature of UbiComp applications also depends on artifact mobility,
which can cause the dynamic disestablishment of a synapse. For example, the
disestablishment of a synapse may happen when two artifacts move outside of each
other’s range or when an artifact suddenly “disappears” due to low battery or other
failure. Since our vision refers to “smart” UbiComp applications and artifacts that
exploit the knowledge that they have acquired by experience, the desirable solution to
the “disappearance” of an artifact is to replace it automatically rather than just ignore
it. In order to make artifact replacement feasible, a mechanism for finding
“similar” artifacts should be described. We chose to replace an artifact with another
one that offers the same services.



2.3 Semantic Service Discovery

The plugs are software classes that make the artifacts’ capabilities visible to people
and to other artifacts. The term that we use for these capabilities is Services. For
example, the artifact “eLamp” (Figure 1) provides the Service “light” through the Plug
“switch on/off”. The concept of services in UbiComp applications is a fundamental
one, since services can play a major role when determining artifacts’ replaceability
and plugs’ compatibility; for example, we assume that an artifact A participating in
Synapse S with Plug P can be replaced by another artifact B that provides a service P’
similar to P. Furthermore, a user forms synapses seeking to achieve certain service
configurations; thus a service discovery mechanism is necessary. In UbiComp
environments this mechanism must be enhanced to provide semantic service
discovery; with this term we refer to the possibility of discovering all the semantically
similar services. This mechanism is assisted by a service classification represented
in the ontology that we developed.
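To make the idea of semantic service discovery more concrete, the following sketch treats two services as semantically similar when one is a refinement of the requested service in a small classification tree. The tree fragment (optic refined into light and image) follows the classification described for the GAS-CO, whereas the matching rule, the function names and the artifacts “eCamera” and “eClock” are assumptions introduced for this example.

```python
from typing import Dict, List, Optional

# Fragment of a service classification: child service -> parent service.
SERVICE_PARENT: Dict[str, Optional[str]] = {
    "optic": None,
    "sonic": None,
    "thermal": None,
    "light": "optic",
    "image": "optic",
}


def is_refinement_of(service: str, requested: str) -> bool:
    """True if service equals requested or lies below it in the classification."""
    current: Optional[str] = service
    while current is not None:
        if current == requested:
            return True
        current = SERVICE_PARENT.get(current)
    return False


def discover(requested: str, offers: Dict[str, List[str]]) -> List[str]:
    """Return the names of all artifacts offering a semantically similar service."""
    return sorted(
        artifact
        for artifact, services in offers.items()
        if any(is_refinement_of(s, requested) for s in services)
    )
```

A request for “optic” would then return every artifact offering “light” or “image”, while a request for “light” returns only the lamps.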



2.4 Conceptualisation of Ubicomp Applications 

The GAS constitutes a generic framework shared by both artifact designers and users 
for consistently describing, using and reasoning about a family of related UbiComp 
applications. GAS defines the concepts and mechanisms that will allow people to 
define, create UbiComp applications out of artifacts and use them in a consistent and 
intuitive way. As the artifacts are enhanced everyday physical objects, users do not
have to deal with objects unfamiliar to them, but merely to view their world from a
different perspective and become familiar with its enhanced concepts. This new world
view consists of a set of basic terms, their definitions and their inter-relations.
The necessity of capturing and representing this knowledge is evident, as the
deployment of UbiComp applications is based on this knowledge. Since ontologies
can conceptualise a world view by capturing the general knowledge and defining the
basic concepts and their interrelations [15], we decided to use an ontology in order to
conceptualise the terms of UbiComp applications. The ontology that we developed is
the GAS Ontology, and its first goal was the description of the semantics of the basic
terms of UbiComp applications, such as eGadget (our term for artifact), Plug,
Synapse, Service and eGadgetWorld (our term for UbiComp application), and the
definition of their interrelations.



2.5 Context-Awareness 

An important issue of UbiComp environments is the context-awareness, as these 
environments must be able to obtain the current context and adapt their behavior to 
different situations. In UbiComp applications, different kinds of context can be used 
like physical information, e.g. location and time, environmental information, e.g. 
weather and light, personal information, e.g. mood and activity. In our case, the term 
context refers to the physical properties of artifacts including their sensors/actuators 
and to their plugs that provide services; for example the eBook artifact through its 




Using Ontologies to Address Key Issues in Ubiquitous Computing Systems 17 



plug “open/close” provides to other artifacts a kind of context information relative to 
its state. The user, by establishing synapses between plugs, defines the emerging 
behavior of a UbiComp application; e.g. the user with the synapse at the “study” 
UbiComp application, illustrated in Figure 1, defines the eLamp’s behavior in 
proportion to the context provided by the eBook. Thus the UbiComp applications can 
demonstrate different behaviors even with the same context information. 



3 Designing the GAS Ontology 

The ontology that we developed in order to address the aforementioned issues in 
ubiquitous computing systems is the GAS Ontology [2] and is written in 
DAML+OIL. The basic goal of the GAS Ontology is to provide the necessary 
common language for the communication and/or collaboration among the artifacts. 



3.1 Ontology Layers 

The artifacts’ ontology contains the description of the basic concepts of UbiComp
applications and their inter-relations; for communication among artifacts to be feasible,
this knowledge must be common. Additionally, an artifact’s ontology must both
contain the artifact’s description, e.g. the description of its plugs and services, and
represent the knowledge it has acquired from the synapses in which its plugs participate.
So the knowledge that each artifact’s ontology represents cannot be the same for all
artifacts, as it depends on the artifact’s description and on the UbiComp applications
in which the artifact has participated in the past.

Since artifacts’ interoperability is based on their ontologies, the existence of
different ontologies could result in inefficient interoperability. An awkward solution
to this issue could be the merging of all existing artifacts’ ontologies into a global one,
which would inevitably result in a very large knowledge base. This solution is
undesirable for two reasons: first, it does not respect the limited memory capabilities
of the artifacts, and second, it would work properly only if all artifacts’ ontologies were
synchronized. Another solution could be the use of a server on which all artifacts’
ontologies are stored and to which each artifact has access. This solution conflicts
with the autonomous nature of artifacts.

The solution that we propose allows each artifact to have a different ontology, on
the condition that all ontologies are based on a common vocabulary. Specifically,
the GAS Ontology is divided into two layers: the GAS Core Ontology (GAS-CO),
which contains the common vocabulary, and the GAS Higher Ontology (GAS-HO), which
represents artifact-specific knowledge using concepts defined in the GAS-CO.

3.2 The GAS Core Ontology (GAS-CO) 

The GAS-CO describes the common language that artifacts use to communicate. So it
must describe the semantics of the basic terms of UbiComp applications and define
their inter-relations. It must also contain the service classification in order to support
the service discovery mechanism. An important feature of the GAS-CO is that it
contains only the information necessary for the interoperability of artifacts, so that it
remains very small and even artifacts with limited memory capacity can store it. The
GAS-CO is static and cannot be changed either by the manufacturer of an artifact or
by a user. The graphical representation of the GAS-CO is shown in Figure 2.




Fig. 2. A graphical representation of GAS-CO 



The core term of GAS is the eGadget (eGt). In GAS-CO the eGt is represented as a
class, which has a number of properties, such as a name. The notion of plug is
represented in the GAS-CO as another class, which is divided into two disjoint
subclasses: the TPlug and the SPlug. The TPlug describes the physical properties of
the object that is used as an artifact, such as its shape; note that there is a cardinality
restriction that an artifact must have exactly one TPlug. An SPlug, on the other hand,
represents the artifact’s capabilities and services; artifacts have an arbitrary number of
SPlugs. Another GAS-CO class is the synapse, which represents a synapse between two
plugs; a synapse may only appear between two SPlugs. Using the class of eGadgetWorld
(eGW), the GAS-CO can describe the UbiComp applications that are created by the users; an
eGW is represented by the artifacts that it contains and the synapses that compose it.
The class of eGW has two cardinality constraints: an eGW must contain at least two
artifacts, and a synapse must exist between their SPlugs.
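The classes and cardinality constraints described above could be mirrored in ordinary code roughly as shown below. The actual GAS-CO is written in DAML+OIL; this Python rendering, including all field names and the `validate` method, is only an illustrative assumption about how the constraints might be checked.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TPlug:
    shape: str  # physical properties of the object, e.g. its shape


@dataclass
class SPlug:
    name: str
    services: List[str] = field(default_factory=list)


@dataclass
class EGadget:
    name: str
    tplug: TPlug  # exactly one TPlug per artifact, enforced by construction
    splugs: List[SPlug] = field(default_factory=list)  # arbitrary number


@dataclass
class Synapse:
    source: SPlug
    target: SPlug

    def __post_init__(self) -> None:
        # A synapse may only appear between two SPlugs.
        if not isinstance(self.source, SPlug) or not isinstance(self.target, SPlug):
            raise TypeError("a synapse may only connect two SPlugs")


@dataclass
class EGadgetWorld:
    gadgets: List[EGadget]
    synapses: List[Synapse]

    def validate(self) -> None:
        """Check the two eGW constraints described in the text."""
        if len(self.gadgets) < 2:
            raise ValueError("an eGW must contain at least two artifacts")
        if not self.synapses:
            raise ValueError("an eGW must contain a synapse between SPlugs")
        owned = [p for g in self.gadgets for p in g.splugs]
        for syn in self.synapses:
            if syn.source not in owned or syn.target not in owned:
                raise ValueError("synapse endpoints must belong to the eGW's artifacts")
```

The “study” application of Figure 1 would be an `EGadgetWorld` with two gadgets (eBook and eLamp) and one synapse between their SPlugs.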

As an eGt provides a number of services through an SPlug, the GAS-CO contains a
class for the notion of service. An artifact’s services are closely related to what the
artifact’s actuators/sensors can transmit/perceive. So the class of service is divided
into subclasses in order to describe a service classification, which is based on the type
of the signals that an actuator/sensor transmits/perceives. Some elementary forms of
signals that are described are the following: electric, electromagnetic, gravity, kinetic,
optic, thermal and sonic. Note that a service can be further refined into higher level
services: e.g. the optic service can be refined into light, image, etc. Additionally a
service may have a set of properties; e.g. light can have as properties the color, the
luminosity, etc.






3.3 The GAS Higher Ontology (GAS-HO) 

The GAS-HO represents both the description of an artifact and its acquired 
knowledge. These descriptions follow the definitions contained in the GAS-CO. So, 
specifically the knowledge stored into the GAS-HO is represented as instances of the 
classes defined into the GAS-CO. For example the GAS-CO contains the definition of 
the concept SPlug, while the GAS-HO contains the description of a specific SPlug 
represented as an instance of the concept SPlug. Note that the GAS-HO is not a stand- 
alone ontology, as it does not contain the definition of its concepts and their relations. 

Since the GAS-HO represents the private knowledge of each artifact, it is different
for each artifact. Therefore we can envision the GAS-HO as an artifact’s private ontology.
Contrary to the GAS-CO, whose size is required to be small, the size of the GAS-HO
is limited only by the artifact’s memory capacity. Obviously the GAS-HO is not static, and
it can change over time without causing problems for communication among artifacts. As
the GAS-HO contains both static information about the artifact and dynamic
information emerging from its knowledge and use, we decided to divide it into the
GAS-HO-static and the GAS-HO-volatile.

The GAS-HO-static represents the description of an artifact, containing information
about the artifact’s plugs, the services that are provided through these plugs, its sensors
and actuators, as well as its physical characteristics. For example, the GAS-HO-static
of the “eLamp” artifact contains the knowledge about the physical properties of
“eLamp”, such as its luminosity, the description of its SPlug “switch on/off” based on
the definition provided by GAS-CO, as well as the declaration that the SPlug “switch
on/off” provides the service “light”.

On the other hand, the GAS-HO-volatile contains information derived from the
artifact’s acquired knowledge and its use. Specifically, it describes the synapses to which
the artifact’s plugs are connected, the UbiComp applications in which it takes part,
as well as information about the capabilities of other artifacts that it has become
acquainted with through communication. An artifact’s GAS-HO-volatile is updated during the
artifact’s various activities, such as the establishment of a new synapse.



4 The GAS Ontology Manager 

An artifact in order to participate in our UbiComp applications has to be GAS- 
compatible. An artifact is GAS-compatible if it uses the GAS-Operating System 
(GAS-OS), which is responsible for the communication among artifacts through a 
communication module and the management of synapses through the process 
manager. The GAS Ontology manager is a module of the GAS-OS and provides the 
mechanism for the interaction of an artifact with its stored ontology and the 
management of this ontology. 

One of the most important features of the GAS Ontology manager is that it adds a 
level of abstraction between GAS-OS and the GAS Ontology. This means that only 
the GAS Ontology manager can understand and manipulate the GAS Ontology; the 
GAS-OS can simply query this module for information stored into the GAS Ontology 
without having any knowledge about the ontology language or its structure. Therefore
any changes to the GAS Ontology affect only the GAS Ontology manager, and the
rest of the GAS-OS is isolated from them.

The GAS-CO must be common for all the artifacts and it cannot be changed during 
the deployment of UbiComp applications. So the GAS Ontology manager provides 
methods that can only query the GAS-CO for knowledge such as the definitions of 
specific concepts like eGadget and Plug and knowledge relevant to the service 
classification. Likewise it can only query the GAS-HO-static of an artifact. On the 
other hand since it is responsible for keeping up to date the GAS-HO-volatile of an 
artifact, it can both read and write to it. As the GAS-HO contains only instances of the 
concepts defined in the GAS-CO, the basic methods of the GAS Ontology manager 
relevant to the GAS-HO can query for an instance and add new ones based on the 
concepts defined in the GAS-CO. So an important feature of the GAS Ontology 
manager is that it enforces the integrity of the instances stored in the GAS-HO with 
respect to the concepts described in GAS-CO. 
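The access rules described above (query-only access to the GAS-CO and the GAS-HO-static, read and write access to the GAS-HO-volatile, and an integrity check against the GAS-CO concepts) can be illustrated with the following sketch. The class `OntologyManager`, its method names and the dictionary-based storage are hypothetical; the real manager operates on a DAML+OIL ontology whose interface is not given in the text.

```python
from types import MappingProxyType
from typing import Any, Dict, Iterable, Mapping, Optional


class OntologyManager:
    """Hypothetical sketch of the access rules of the GAS Ontology manager."""

    def __init__(self, core_concepts: Iterable[str],
                 static_instances: Mapping[str, Dict[str, Any]]) -> None:
        self._core = frozenset(core_concepts)  # GAS-CO: fixed set of concepts
        # GAS-HO-static: exposed only through a read-only view.
        self._static = MappingProxyType(dict(static_instances))
        self._volatile: Dict[str, Dict[str, Any]] = {}  # GAS-HO-volatile

    def has_concept(self, concept: str) -> bool:
        return concept in self._core

    def query_static(self, name: str) -> Optional[Dict[str, Any]]:
        return self._static.get(name)

    def query_volatile(self, name: str) -> Optional[Dict[str, Any]]:
        return self._volatile.get(name)

    def add_volatile(self, name: str, concept: str, data: Dict[str, Any]) -> None:
        """Store a new instance, but only of a concept defined in the GAS-CO."""
        if concept not in self._core:
            raise ValueError(f"unknown concept: {concept}")
        self._volatile[name] = {"concept": concept, **data}
```

Because no method writes to the core concepts or the static instances, the rest of the system can only query them, while new knowledge such as an established synapse goes through `add_volatile` and is rejected if it does not instantiate a known concept.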

The communication among artifacts is initially established using the artifacts’ 
GAS-HO; if their differences obscure the communication, the GAS Ontology 
manager is responsible for the interpretation of GAS-HO based on the common GAS- 
CO. Therefore the communication among artifacts is ensured. Apart from assisting 
the communication among artifacts, the GAS Ontology manager enables knowledge 
exchange among them by sending parts of an artifact’s GAS-HO to another’s. 

One of the goals of the GAS Ontology is to describe the services that the artifacts provide
and to assist the service discovery mechanism. In order to support this functionality, the
GAS Ontology manager provides methods that query both the GAS-HO-static and the
GAS-HO-volatile for the services that an SPlug provides, as well as for the SPlugs that
provide a specific service. Thus the GAS Ontology manager provides the GAS-OS with
the necessary knowledge stored in an artifact’s ontology relevant to the artifact’s
services, in order to support the service discovery mechanism. Similarly, the GAS
Ontology manager can answer queries about plug compatibility and artifact
replaceability.



5 Using the GAS Ontology in a Ubiquitous Computing System 

In this section we present an example of how the GAS Ontology can be used in a
ubiquitous computing environment and the role of the GAS Ontology manager, using
the scenario for the study UbiComp application illustrated in Figure 1. According to
this scenario, a user creates their own “study” UbiComp application using two artifacts,
an eBook and an eLamp.

In the UbiComp applications the interaction among artifacts is feasible because it is 
based on common concepts and terms. These terms are defined into the GAS-CO. So 
as the GAS-CO provides the artifacts with the necessary common language, both the 
eBook and the eLamp artifacts have stored the same GAS-CO. 

On the other hand, the artifacts’ GAS-HO ontologies are different. For example, the
eLamp’s GAS-HO-static contains information about the eLamp’s SPlug “switch on/off”,
and the eBook’s GAS-HO-static contains the description of the SPlug “open/close”.
These two artifacts are connected through a synapse which is established between the
aforementioned plugs, forming the study UbiComp application. So when the user
opens the eBook, the eLamp switches on, adjusting the light conditions to a specified
luminosity level in order to satisfy the user’s profile. The knowledge emerging from
this synapse is stored in the GAS-HO-volatile of both artifacts that participate in
it. So the eBook “knows” that its SPlug “open/close” participates in a synapse with
an SPlug that provides the service “light” with a specific luminosity. Note that the
GAS Ontology manager is responsible for storing knowledge in an artifact’s
GAS-HO-volatile by using the definitions of the concepts represented in the GAS-CO.

As the context information that is used in UbiComp applications describes the
physical and digital properties of artifacts, it is represented in both the GAS-CO and
each artifact’s GAS-HO-static. Note that the developer of the ubiquitous computing
system has access to this information and can change it or add new elements. The
GAS-HO-volatile of artifacts contains mainly knowledge emerging from the synapses
that compose a UbiComp application. So this information represents the artifacts’
behavior when they get context information through their synapses; these behaviors
are defined by the user of a UbiComp application. As the GAS Ontology contains
both context information and the description of the behaviors in relation to context,
it makes UbiComp applications context-aware environments.

If this synapse is broken, for example because of a failure of the eLamp, a new
artifact having an SPlug that provides the service “light” must be found. The eBook’s
GAS-OS is responsible for sending a service discovery message to the GAS-OS of the
other artifacts that participate in the same UbiComp application. This type
of message is predefined and contains the type of the requested service and the
service’s attributes. Note that an artifact may query just for a type of service or for a
service with specific attributes. Above we showed that an artifact can be replaced by
another one providing similar services. As the GAS Ontology contains both physical
and digital information about an artifact, it is easy to exploit such information in order to
replace an artifact.

When the GAS-OS of an artifact receives a service discovery message, it forwards
it to the artifact’s GAS Ontology manager. Assume that the artifact “eDeskLamp”
participates in the “study” UbiComp application and that this is the first artifact that
gets the service discovery message. The eDeskLamp’s GAS Ontology manager
first queries the GAS-HO-static of the eDeskLamp in order to find out whether this
artifact has an SPlug that provides the service “light”. If we assume that the eDeskLamp has the
SPlug “LampDimmer” that provides the service “light”, the GAS Ontology manager
will send the eDeskLamp’s GAS-OS a message with the description of this SPlug.
If such an SPlug is not provided by the eDeskLamp, the GAS Ontology manager
queries the eDeskLamp’s GAS-HO-volatile in order to find out whether another artifact, with
which the eDeskLamp has previously collaborated, provides such an SPlug. If the
GAS Ontology manager finds such an SPlug in the GAS-HO-volatile, it sends the
description of this SPlug to the artifact’s GAS-OS. If the queried artifact, in our example
the eDeskLamp, has no information about an SPlug that provides the requested
service, control is returned to the GAS-OS, which is responsible for sending the
service discovery query message to another eGadget. Note that all artifacts have the
same service classification, which is stored in the GAS-CO; thus the service discovery
messages are based on this classification.
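The lookup order described in this walkthrough (own SPlugs in the GAS-HO-static first, then SPlugs of previously known artifacts in the GAS-HO-volatile, otherwise hand control back so that the query can be forwarded) can be summarized in a short sketch. The function name, the dictionary layout and the return convention are assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple


def handle_discovery(
    requested: str,
    own_name: str,
    static_splugs: Dict[str, List[str]],
    volatile_splugs: Dict[Tuple[str, str], List[str]],
) -> Optional[Tuple[str, str]]:
    """Answer a service discovery message for one artifact.

    static_splugs maps the artifact's own SPlug names to their services.
    volatile_splugs maps (artifact name, SPlug name) pairs of previously
    encountered artifacts to their services.
    Returns (artifact name, SPlug name), or None if nothing is known.
    """
    # Step 1: query the GAS-HO-static for one of the artifact's own SPlugs.
    for splug, services in static_splugs.items():
        if requested in services:
            return (own_name, splug)
    # Step 2: query the GAS-HO-volatile for SPlugs of known artifacts.
    for (artifact, splug), services in volatile_splugs.items():
        if requested in services:
            return (artifact, splug)
    # Step 3: nothing found; the caller forwards the query to another eGadget.
    return None
```

In the example above, the eDeskLamp would answer a request for “light” with its own SPlug “LampDimmer” in step 1, so its GAS-HO-volatile would never be consulted.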



6 Related Work

Since the artifacts can be perceived as agents that communicate and collaborate, our
work is closely related to the field of agent communities. It is widely acknowledged
that without some shared or common knowledge the members of a multi-agent system
have little hope of effective communication. The solution that we propose is based on
the idea that all artifacts have a common ontology, the GAS-CO, and that each artifact
also has a different, “private” ontology, the GAS-HO, which is based on the GAS-CO.
This idea is similar to the one presented in [14], where the communication among the
agents relies on partially shared ontologies.

Ontologies have been used in a number of ubiquitous computing infrastructures in
order to address issues that emerge from the composition of ubiquitous computing
systems. A well-known example is UbiDev [5], a homogeneous middleware that allows
the definition and coordination of services in interactive environment scenarios. In this
middleware, according to [12], resource classification relies on a set of abstract
concepts collected in an ontology, and the meaning of these concepts is implicitly
given by classifiers. The main advantage of this approach to the resource
management problem is that resource selection is based on the semantics of the
resources, which is given by the context. Since every application may have its own
ontology, the application structure is separated from the implementation. This approach
differs from the one that we have followed: whereas they use one ontology for each
application, which includes several devices, our goal is to provide an ontology that
facilitates the use of devices in various ad hoc UbiComp applications.

Ontologies are also integrated in the Smart Spaces framework GAIA [8] [11]. In
this work ontologies have been used to overcome a number of problems
in the GAIA ubiquitous computing framework, such as interoperability between
different entities, discovery and matching, and context-awareness. The
approach that the GAIA framework follows is fairly different from the one that we have
proposed for the eGadgets project. Specifically, in the GAIA framework there is an
Ontology Server that maintains the ontologies, and there are different kinds of
ontologies, such as ontologies that hold meta-data about the environment’s entities
and ontologies that describe the environment’s contextual information. The ontologies
in GAIA are also used to support the deployment of context-aware ubiquitous
environments [10]. Another approach is COBRA-ONT [1], an ontology for
context-aware pervasive computing environments.

The Task Computing Environment [7] was implemented to support
task computing, which fills the gap between what users really want to do and the
capabilities of devices and/or services that might be available in their environments.
This approach is fairly different from ours, since it uses OWL-S to describe
both Web services and the services offered by devices.

Finally, a very interesting work is the one carried out by the Semantic Web in UbiComp
Special Interest Group [13]. The basic goal of this group is to define an ontology to
support knowledge representation and communication interoperability in building
pervasive computing applications. The project does not aim to construct a
comprehensive ontology library that would provide vocabularies for all possible
pervasive applications, but to construct a set of generic ontologies that allow
developers to define vocabularies for their individual applications.

7 Lessons Learned and Future Work

In this paper we outlined a set of key issues that arise during the composition of
UbiComp applications and presented how ontologies can be used to address
such issues. We supported our approach by describing the GAS Ontology that we
developed and integrated into the GAS, a framework that supports the deployment of
UbiComp applications using artifacts as building blocks. Although in this paper
we showed the use of the GAS Ontology in UbiComp applications only through simple
examples, this ontology has been used to compose and deploy a number of
UbiComp applications. So far we have used this infrastructure to build
UbiComp applications with more than ten artifacts and approximately nine synapses
between their plugs.

Additionally, we used the GAS Ontology in various demonstrations where
non-experienced users created their own UbiComp applications, in order to evaluate it
in demanding situations. For example, during demonstrations users tried to establish
synapses between incompatible plugs. Such situations were successfully handled by
the GAS Ontology manager, which used the knowledge represented in the ontology
to check the plugs’ compatibility. As these demonstrations went on for many
hours, the disestablishment of synapses due to artifacts’ failure and mobility was a
frequent event. The infrastructure reacted to these events by discovering
artifacts that provide semantically similar services. Although the service discovery
mechanism always proposed an appropriate artifact, the current version of the GAS
Ontology is restrictive, since it requires all artifacts to have the same service
classification. This is a limitation that we intend to eliminate by adding to the GAS
Ontology manager the capability to map one service description to another, using
the knowledge that artifacts have acquired from their collaboration.

The design of the GAS Ontology and the decision to divide it into two layers
proved very helpful for both its development and its use. Specifically, this
approach resulted in a small GAS-CO, allowing the creation of large, extensible
and flexible GAS-HOs. It thus satisfies the demands of long-running and real-time
UbiComp systems. During the construction of artifacts’ GAS-HOs the difficulty that
we encountered was related to the definition of the services that plugs provide. For
example, the eBook’s plug “open/close” reflects its state, but it can also be regarded as
a plug that provides the service “switch”. In order to ease the creation of GAS-HOs,
one of our goals is to create a graphical interface through which users can also add
information that emerges from their own perception and demands.

Regarding the issue of context-awareness, our infrastructure is at an early stage.
We believe that the use of the plug/synapse model as a context model is sufficient,
although we need a more elaborate context management and reasoning process. The
first step is the acquisition of the low-level context, that is, raw data from sensors,
followed by its interpretation into high-level context information. Then artifacts, based
on their context, will assess their state and select their appropriate behaviour using a
set of rules and axioms. The reasoning will be based on the definition of the ontology,
which may use simple description logic or first-order logic. Finally, one of our goals is
to define a user model in order to handle the existence of various user profiles in
the same UbiComp application.

Abstract. This paper investigates the impact of agent migration on the performance
of personal agents in an ambient use case scenario. Four migration
policies are proposed and a performance model, based on two quantitative
performance metrics, network load and response time for processing the user
request, is applied to compare the performance of the proposed migration policies.
The mapping of the scenario to the performance model and the analytical results are
also discussed in this paper. An agent emulator is designed and implemented in the
Java programming language based on the Jatlite agent framework. The emulator
is used to produce the experimental results of this study. It has been shown that,
in this scenario, the policy in which a user agent follows the mobile user on the
wired side of the wireless network lowers the response time when the agent size is
smaller than the reply size. However, when the reply size is smaller than the
agent size, a simple stay-at-home policy outperforms the other three policies.



1 Introduction 

In the converging worlds of mobile telecommunications, broadcasting and computing,
which promise the provision of information at any time, in any place and in any form,
the end-user and the improvement of his/her experience are of great importance. In a
user-centric approach, having a set of “use case scenarios” is considered vital. Use case
scenarios help to determine and specify issues related to possible future user needs
and wishes independently of present technical or economical constraints. They can
identify the elements and players involved and the issues to be solved. In this top-down
approach we start from user requirements and end up with detailed technical
requirements and flows of services and money [1]. In [2] a number of high-level
motivating scenarios have been defined. They cover a range of different possible
situations including Home Service, Transportation, Mobile Multimedia, Telemedicine
and Mobile Commerce.

Core 2 Research Programme of the Virtual Centre of Excellence in Mobile & Personal
Communications, Mobile VCE, (www.mobilevce.co.uk) whose funding support is gratefully
Members of Mobile VCE.

In the realisation of such scenarios, “software agent technology” is one of the
enabling technologies. Agent technology is able to provide an increasing number of
services for individuals, groups and organisations, such as information finding, filtering
and presentation, service/contract negotiation, and electronic commerce.

Futuristic scenarios on the one hand and software agent technology on the other
have been addressed in many papers and research projects; however, little or no
work has been carried out on the performance analysis of future agent-based scenarios.
Many researchers have investigated the development of agent-based systems, but few
have proposed an evaluation of the technology through actual measurements in a real
environment. Most of the literature that studies the performance of agents compares
the performance of mobile agent systems with traditional client/server systems, such
as [3]. In [4] performance issues of a mobile network have been studied by considering
a mobile agent network as a queuing system where an agent represents an
information unit to be served. [5] studied the performance of mobile agent systems in
data filtering applications. [6] investigated the performance aspects of an agent-based
Virtual Home Environment. [7], [8] and [9] studied the performance of mobile agents in
network management.

In this paper, the performance evaluation of an ambient use case scenario is
discussed. This scenario, chosen from [2], addresses the provision of rich data services
independent of location by using “personal agents”. Personal agents and the issue of
personal agent migration have not received much attention in the literature, specifically
in the wireless environment and for the nomadic users discussed in the scenario.
Therefore, in this paper we propose four personal agent migration policies, inspired by
the very first paper on this issue [10]. Then we map our scenario to a performance
model presented in [11], and based on this model we analyse and compare the
performance of the four agent migration policies in the scenario using two quantitative
performance metrics. An agent emulator has been used to produce the experimental
results of this study. This emulator has been designed and implemented in the Java
programming language using the Jatlite agent framework [12], and the analysis and
design of the emulator are based on the GAIA agent-oriented software engineering
methodology [13].

The rest of this paper is organised as follows. In Section 2, the concept of agent
migration policies and different possible agent migration policies are explained. In
Section 3, the performance model based on two quantitative performance metrics is
presented. In Section 4, the mapping of the scenario to the performance model is
described. Section 5 contains the performance evaluation and comparison of the
different agent migration policies. Finally, Section 6 concludes this paper and suggests
some open issues for further research.



2 Personal Agent Migration Policies in a Use Case Scenario 

Personal or user agents are what the software community has offered for the
realisation of “Personal Assistants”. “Personal Assistants help users in their day-to-day
activities, especially those involving information retrieval, negotiation, or coordination”
[14]. A personal assistant takes care of the user’s schedule and handles routine
communication. It might make travel arrangements, shop for the best price and even
schedule a meeting and then, based on the meeting location, find the cash machine
with the lowest transaction fee [14].

As wireless applications become more available and more nomadic users adopt
them, the impact of user mobility on personal agents and personal
agent migration become important issues. A “migration policy” makes the decision
about the mobility of personal agents. Migration policies can be categorised as either
active or reactive. When a mobile user moves from one cell to another, a reactive
migration policy makes decisions about migrating the associated personal agent. An
active migration policy, however, may migrate the associated personal agent even when
the mobile user remains in its current cell. In this paper we study the
performance of different reactive migration policies. Migration policies can also
be categorised as always-migrate and never-migrate policies. An always-migrate
policy, as the name suggests, migrates the personal agent along with the mobile
user each time the mobile user moves from one base station to another. In this
policy there is no distance penalty associated with sending messages/data between
mobile users and their agents. However, an always-migrate policy may lead to load
imbalances when a large number of users gather in a cell. In a never-migrate
policy, the personal agent is located at a fixed location and, as the name suggests,
never migrates. When the mobile users move, the distance between the mobile users
and their agents may grow and the messages/data may have to travel long
distances. This may lead to inefficient use of network bandwidth [10].

As we can see, neither of these types of policy seems optimal. But there could
be a third type of policy that is closer to the optimal solution. In this type of policy,
personal agents decide intelligently at each move whether to stay or migrate, in order
to improve their performance in comparison with never-migrate and always-migrate
policies. We call these sometimes-migrate policies. There are different policies for
making the decision when to migrate and when to stay. Count and distance policies are
two examples of this type of policy. In the count policy the maximum number of
agents on a given processor is limited, while in the distance policy agents that are
further away from their users get a higher priority for migration. Both count and
distance policies are threshold-based policies [10].
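
The two threshold-based policies mentioned above can be illustrated with a minimal
Python sketch. The function names, the use of a hop count as the distance measure and
the concrete threshold values are assumptions made for this example; they are not taken
from [10].

```python
def count_policy_allows(agents_on_target: int, max_agents: int) -> bool:
    """Count policy: accept a migration only while the target host is below its cap."""
    return agents_on_target < max_agents


def distance_policy_order(distances: dict, threshold: int) -> list:
    """Distance policy: agents farther than the threshold migrate, farthest first."""
    candidates = [(d, name) for name, d in distances.items() if d > threshold]
    return [name for d, name in sorted(candidates, reverse=True)]


# Hypothetical distances (in network hops) between three agents and their users.
hops = {"agent_a": 3, "agent_b": 1, "agent_c": 5}
print(distance_policy_order(hops, threshold=2))
print(count_policy_allows(agents_on_target=4, max_agents=4))
```

In both cases the decision reduces to comparing one measured quantity with a fixed
limit, which is what distinguishes these policies from the four policies studied in this
paper.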

In this paper, for the purpose of the performance analysis of the given scenario,
four migration policies are proposed based on the main types explained above.
Two policies can be considered as never-migrate policies and the other two are
sometimes-migrate policies. However, unlike count and distance policies, the decision
to migrate is not a threshold-based decision. In the first policy the personal agent stays
at the user’s home. In the second policy migrations happen from home to a few
well-known locations, such as the office and the user’s mobile terminal. In the third
policy copies of the personal agent are located at the well-known locations mentioned
above. In the fourth policy, the personal agent moves within the fixed part of the
wireless network. All four policies are explained in the following sections.







E. Homayounvala and A.H. Aghvami 



2.1 Policy One 

In a never-migrate policy, the personal agent is assigned to a fixed processor at a
fixed location, and this fixed location could be the user’s home. A personal agent located
in a home network serves the user via various devices at home, from the television and
kitchen appliances to personal computers and printers. This agent can perceive
the state of the home through sensors and act on the environment through effectors.
When the user is not physically at home, all personalised services could be managed
by establishing a connection to the personal agent at home.



2.2 Policy Two 

A personal agent in policy two migrates from home to a few well-known locations
such as the user’s mobile terminal, car or office. This is a sometimes-migrate
policy. In this case, the personal agent migrates from home to the user’s mobile terminal
and to the office and vice versa. Thus migration happens between the home
workstation, the mobile terminal and the office workstation.



2.3 Policy Three 

Policies two and three are very similar. Generally, when a personal agent migrates from
one host to another, it transfers its code, state and data to the destination. But the
cost of transferring the agent code can be saved by cloning the personal agent. In
other words, there could be copies of the personal agent in the mobile terminal and the
office workstation as well as the home workstation. Unlike policy two, in which there
is one personal mobile agent, there are three static agents acting as personal assistants
in policy three. The Residential Personal Assistant, Personal Mobile Assistant and
Office Assistant assist the user at home, outdoors and in the office, respectively.



2.4 Policy Four 

There are various differences between wireless and wired connections; among them is
that a wireless connection has more limited resources, such as bandwidth and power. In
addition, the cost of a wireless connection is much higher than that of a similar wired
connection. When the user is outdoors, the connection between the personal agent,
located in the car or mobile terminal, and the core network consumes power and
bandwidth. Therefore, in order to save power and bandwidth, the personal agent can
migrate from the home to the wired side of the wireless network instead of the mobile
terminal. The mobile terminal needs to connect only when receiving the result. This
policy is particularly useful when the mobile terminal is not capable of running agents.
By migrating to the fixed network, which has access to high-bandwidth links and large
computational resources, personal agents can also play the role of an intermediary
server. In this policy personal agents should migrate from host to host in the fixed
network every time the user moves.

3 Performance Modelling



Each user scenario consists of sequences of interactions between different entities,
including users, agents, service and content providers, etc. Therefore, in order to be able
to evaluate the performance of a scenario, one must present a performance model for
a sequence of interactions and consequently for a single interaction. If the agent is to
reason and decide whether to migrate or stay in any single interaction, it is necessary
to be able to evaluate both the migrate and the stay cases and compare them. For the
purpose of this study two quantitative performance metrics, network load and the
average response time for processing user requests, have been analysed in a
performance model. The goal of a personal agent’s decision-making is to keep the value
of these two metrics as low as possible. Reduction of network load reduces bandwidth
requirements, which is extremely important in wireless links, and reduction of response
time to acceptable levels is vital to keep users happy. The performance of a sequence of
interactions in terms of network load and response time, for a mixed sequence of
message passing and agent migrations, can be written as follows [11]:



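$$B_{seq}(S, D, B_{A_0}) = \sum_{i=1}^{n} \left( B_{Mig}(D_{i-1}, D_i, B_{A_{i-1}}) + m_i \, B_{RPC}(D_i, R_i, B_{req_i}, B_{rep_i}) \right) \qquad (1)$$

$$T_{seq}(S, D, B_{A_0}) = \sum_{i=1}^{n} \left( T_{Mig}(D_{i-1}, D_i, B_{A_{i-1}}) + m_i \, T_{RPC}(D_i, R_i, B_{req_i}, B_{rep_i}) \right) \qquad (2)$$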



In this model, if $S = (I_1, \ldots, I_n)$ is a sequence of interactions, the $i$-th
interaction is described by $I_i = (R_i, m_i, B_{req_i}, B_{rep_i}, \sigma_i)$, where $R_i$
is the remote location with which the communication should take place. In each
communication, $m_i$ (local or remote) messages with request size $B_{req_i}$, reply size
$B_{rep_i}$ and selectivity $\sigma_i$ are sent. The agent size after interaction $i$ is
$B_{A_i} = (B_{code}, B_{data_i}, B_{state_i})$, where $i = 0, \ldots, n$. The destination
vector $D = (D_0, \ldots, D_n)$ describes the mobility behaviour of the agent. For the
$i$-th interaction, the agent moves to destination $D_i$. The network load $B_{RPC}$ (in
bytes) for message passing from location $L_1$ to location $L_2$ is $B_{req} + B_{rep}$.
The corresponding transfer time $T_{RPC}$ for sending a message from location $L_1$ to
location $L_2$ and receiving the reply is:

$$T_{RPC}(L_1, L_2, B_{req}, B_{rep}) = 2\,\delta(L_1, L_2) + \left( \frac{1}{\tau(L_1, L_2)} + 2\mu \right) B_{RPC}(L_1, L_2, B_{req}, B_{rep}) \qquad (3)$$



In the calculation of the transfer time, the time for marshalling (see footnote 2) and
unmarshalling of request and reply (factor $\mu$), and the time for the transfer of the
data on a network with throughput $\tau(L_1, L_2)$ and delay $\delta(L_1, L_2)$ have
been considered.
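
As a worked illustration of the message passing case, the following Python sketch
computes the network load and the transfer time of one request/reply exchange as twice
the delay plus the load multiplied by the sum of the inverse throughput and twice the
marshalling factor. The function names and all numeric values are assumptions chosen
for the example, and treating a purely local exchange as free is also an assumption.

```python
def rpc_load(l1, l2, b_req, b_rep):
    """Bytes sent over the network for one request/reply pair.

    Assumption: a purely local exchange (l1 == l2) puts no load on the network.
    """
    return 0.0 if l1 == l2 else float(b_req + b_rep)


def rpc_time(l1, l2, b_req, b_rep, delay, throughput, mu):
    """Seconds for one request/reply pair: 2*delta + (1/tau + 2*mu) * B_RPC.

    delay is in seconds, throughput in bytes per second, mu in seconds per byte.
    """
    if l1 == l2:
        return 0.0  # assumption: local calls are treated as free
    return 2 * delay + (1.0 / throughput + 2 * mu) * rpc_load(l1, l2, b_req, b_rep)


# Hypothetical numbers: 1 KB request, 9 KB reply, 25 000 bytes/s link,
# 50 ms one-way delay and 1 microsecond per byte for marshalling.
t = rpc_time("L0", "L1", 1_000, 9_000, delay=0.05, throughput=25_000.0, mu=1e-6)
print(f"load = {rpc_load('L0', 'L1', 1_000, 9_000):.0f} bytes, time = {t:.2f} s")
```

With these values the two delay terms contribute 0.1 s and the transfer and marshalling
of the 10 000 bytes contribute 0.42 s.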

2 “Marshalling” refers to the process of taking arbitrary data (characters, integers, structures)
and packing them up for transmission across the network.

In the agent migration case, the agent first moves itself to the vicinity of the
communication peer and then uses message passing locally, eliminating the need for
message passing, especially over the wireless link. However, the agent should be
transferred over the link. The network load for the migration of an agent $A$ from
location $L_1$ to location $L_2$ is calculated by:



$$B_{Mig}(L_1, L_2, B_A, \sigma, B_{rep}) = \begin{cases} 0 & \text{if } L_1 = L_2 \\ (B_{cr} + B_{code})\,P + B_{data} + B_{state} + (1 - \sigma)\,B_{rep} & \text{otherwise} \end{cases} \qquad (4)$$



where $B_{cr}$ is the size of the request from $L_2$ to $L_1$ to transfer the code and
$P$ is the probability that the code is not yet available at location $L_2$. $B_{rep}$ is
the size of the reply, and $\sigma$ represents the selectivity of an agent, that is, how
much $B_{rep}$ is reduced by remote processing. The corresponding execution time for a
single agent migration from location $L_1$ to location $L_2$ amounts to:



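$$T_{Mig}(L_1, L_2, B_A) = \begin{cases} 0 & \text{if } L_1 = L_2 \\ (1 + 2P)\,\delta(L_1, L_2) + \left( \frac{1}{\tau(L_1, L_2)} + 2\mu \right) \left( (B_{cr} + B_{code})\,P + B_{data} + B_{state} \right) & \text{otherwise} \end{cases} \qquad (5)$$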
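
The migration case can be sketched in the same way. The Python functions below are a
hedged illustration, not the authors' code: they assume that the migration load is the
code request and code (weighted by $P$), plus data and state, plus the reply reduced by
the selectivity, and that the migration time has the same delay-plus-transfer form as
the message passing time. The sizes are in KB; 150 KB of data, a state of zero and a
1 KB code request follow the values used later in the paper, whereas the 100 KB code
size, the 1 MB reply, the 0.1 s delay and the 18 KB/s throughput are assumptions.

```python
def migration_load(same_host, p, b_cr, b_code, b_data, b_state, sigma, b_rep):
    """Network load (KB) of one migration plus the reply reduced by selectivity."""
    if same_host:
        return 0.0
    return p * (b_cr + b_code) + b_data + b_state + (1.0 - sigma) * b_rep


def migration_time(same_host, p, b_cr, b_code, b_data, b_state, delay, throughput, mu):
    """Time (s) of one migration; assumed form (1 + 2P)*delta + (1/tau + 2*mu)*moved."""
    if same_host:
        return 0.0
    moved = p * (b_cr + b_code) + b_data + b_state
    return (1 + 2 * p) * delay + (1.0 / throughput + 2 * mu) * moved


common = dict(b_cr=1, b_code=100, b_data=150, b_state=0)
with_code = migration_load(False, p=1, sigma=0.5, b_rep=1000, **common)
cloned = migration_load(False, p=0, sigma=0.5, b_rep=1000, **common)
t_move = migration_time(False, p=1, delay=0.1, throughput=18.0, mu=0.0, **common)
print(with_code, cloned, round(t_move, 2))
```

Setting $P = 0$ models the case in which the code is already present at the destination,
as with the cloned agents of policy three, so only data and state have to be moved.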



4 Scenario Mapping to Performance Model 

In order to apply the performance model described in the previous section to evaluate
the use case scenario, sequences of interactions in the scenario should be identified
and proper values for the parameters defined in the mathematical model have to be
assigned. Our use case scenario, which is described in [2], starts from the user’s home.
The user (John) asks his Residential Personal Assistant (RPA) to deliver the world top
news. While he moves from one room to another in the house, the RPA detects his
movements and transfers the data to devices in the new location. The scenario
continues in John’s car, where he exchanges data with his friend (Paul). John’s PMA
(Personal Mobile Assistant) has been responsible for personalised assistance since he
left home to go to his office, and it helps John to transfer data to Paul’s PDA and find
information about the intended route. Figure 1 models part of the use case scenario to
show how we can apply the performance model. $L_0$ is John’s workstation at his home,
$L_3$ and $L_4$ are John’s and his friend’s PDAs respectively, both located in his car,
and $L_5$ is his office workstation. The sequence of interactions starts at home in the
kitchen ($L_0$), where John asks for a videoconference; then, while talking, he leaves
home and his RPA transfers his call to his mobile terminal ($L_3$). After finishing his
call he requests the transfer of his business data to his friend’s mobile terminal
($L_4$), which is also in the car, and as they enter the office, his workstation in the
office ($L_5$) is updated with business-related data. $L_1$ and $L_2$ are hosts in the
fixed network, which play the role of proxies for providing news and videoconference
services, and $L_6$ and $L_7$ are the actual content providers for news and
videoconferencing respectively.

The interaction vector is: $R = [L_0, L_1, L_2, L_3, L_2, L_4, L_5]$. However, an element
should be added to the mobility vector to take into account the location of the user,
which is not always the location of the agent. Hence the interaction vector for this
part of the scenario is: $R = [L_0, L_1, L_2, L_3, L_2, L_2, L_4, L_5]$.

Fig. 1. Part of the use-case scenario

The sequence of interactions ($S_1$) to model this part of the scenario should be defined
by assuming values for $B_{req}$, $B_{rep}$ and $\sigma$ for interactions 1 to 8.
Additionally, the agent code and state sizes should be assigned proper values, and the
size of the message to request agent code ($B_{cr}$) and the throughput matrix should be
defined. Different migration policies are distinguished in this model by different values
of $P$ (the probability that the code is not available at the destination) and $D$ (the
mobility vector), presented in Table 1. Since the $L_1$-$L_6$ and $L_2$-$L_7$
interactions have the same characteristics in all migration policies, there is no need to
consider these interactions in the performance analysis.



Table 1. Mobility vector for policies one to four

#   Policy   Destination vector D               P
1   One      L0, L0, L0, L0, L0, L0, L0, L0     1
2   Two      L0, L0, L0, L3, L3, L3, L3, L5     1
3   Three    L0, L0, L0, L3, L3, L3, L3, L5     0
4   Four     L0, L0, L0, L2, L2, L2, L2, L5     1
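
To make the role of the destination vector concrete, the following Python sketch encodes
the four vectors of Table 1 and lists the steps at which the agent location changes,
since only these steps incur a migration cost in the model. The dictionary layout and
the helper name migrations are assumptions made for the example.

```python
# Destination vector D and code-transfer probability P for each policy (Table 1).
POLICIES = {
    "one": (["L0"] * 8, 1),
    "two": (["L0", "L0", "L0", "L3", "L3", "L3", "L3", "L5"], 1),
    "three": (["L0", "L0", "L0", "L3", "L3", "L3", "L3", "L5"], 0),
    "four": (["L0", "L0", "L0", "L2", "L2", "L2", "L2", "L5"], 1),
}


def migrations(destinations):
    """Return the (from, to) pairs at which the agent location changes."""
    return [(a, b) for a, b in zip(destinations, destinations[1:]) if a != b]


for name, (vector, p) in POLICIES.items():
    hops = migrations(vector)
    # With P = 0 the code is already at the destination, so no code is transferred.
    print(name, hops, "code transfers:", p * len(hops))
```

Policies two and three share the same destination vector and differ only in $P$, which
is how the model distinguishes one migrating agent from three cloned agents.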



5 Performance Evaluation 

In this section, we concentrate on the performance evaluation of the four agent
migration policies proposed in Section 2 by applying the performance model presented
in Section 3 and the mapping explained in Section 4. First the analytical results are
shown, and then the emulator architecture and the experimental results are explained.



5.1 Analytical Results 

Figure 2 shows part of the scenario, and Table 2 shows the corresponding sequence of
interactions ($S_2$). In this part of the scenario the user asks for a service when
at home ($L_0$) and for three other services while in the car ($L_2$). $L_1$, $L_3$,
$L_4$ and $L_5$ are proxies for each service and $L_6$, $L_7$, $L_8$ and $L_9$ are the
actual content providers. In interaction $L_0$-$L_1$ the user asks for world news.
Interaction $L_0$-$L_2$ might contain agent movement, depending on the migration policy;
therefore the sizes of request and reply have been considered as relatively small and
the data size for agent migration is assumed to be 150 KB. This data size includes the
user profile and the last changes the user has made in his calendar and diary. The agent
state has been assumed to be zero and $B_{cr}$ is 1 KB.




Fig. 2. Part of the use-case scenario 



Table 2. Sequence of interactions S2

i   R_i   B_req_i (KB)   B_rep_i (KB)   B_data_i (KB)
1   L0    0              0              0
2   L1    50             0-2000         1
3   L2    1              1              150
4   L3    30             0-2000         1
5   L4    20             0-2000         1
6   L5    10             0-2000         1
7   L2    15             0-2000         1



The interaction of $L_2$ with $L_3$, $L_4$ and $L_5$ could be a request for intended
route information, a videoconference or any other type of service. The interaction
vector ($R$) and the request and reply sizes for each interaction can be found in
Table 2. In the throughput matrix, the $L_0$-$L_1$ connection is assumed to be
established by a home broadband connection with 200 Kbps throughput, the connection
between $L_2$ and $L_3$-$L_5$ is a wireless connection with 144 Kbps throughput, and
the connection between $L_3$-$L_5$ and $L_7$-$L_9$ is a wired connection with 3.2 Mbps
throughput.
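
The effect of these throughput values can be illustrated with a short calculation. The
Python sketch below estimates the pure transmission time of a payload over each of the
three link types, ignoring delay and marshalling. It assumes that 1 KB is 1000 bytes and
that 1 Kbps is 1000 bits per second, which the paper does not state; the link labels and
the function name are likewise chosen for the example.

```python
# Throughput of the three link types in kilobits per second.
LINK_KBPS = {"home broadband": 200, "wireless": 144, "wired": 3200}


def transfer_seconds(size_kb, kbps):
    """Transmission time only: kilobytes * 8 gives kilobits, divided by kilobits/s."""
    return size_kb * 8 / kbps


# A 1 MB (1000 KB) reply, the reply size assumed for the two-dimensional plots.
for link, kbps in LINK_KBPS.items():
    print(f"{link}: {transfer_seconds(1000, kbps):.1f} s")
```

Under these assumptions the same 1 MB reply needs about 55.6 s on the wireless link but
only 2.5 s on the wired link, which is why keeping large replies off the wireless link
dominates the response time.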

Because of the importance of the agent size in the performance of migration
policies, we have analysed the performance of these policies for different values of
agent code size. Figure 3 illustrates how the network load and response time of
sequence $S_2$ vary when the reply size varies between 0 and 2 MB and the agent size
varies between 0 and 100 KB. In these cases the selectivity factor is assumed to be 0.5.
As we can see in Figure 3, when the agent size is smaller than the reply size, policy
four is the best policy. On the other hand, when the reply size is smaller than the agent
size, policy one outperforms the other three. In policy four, we save the mobile
terminal’s precious resources by sending the personal agent to the fixed part of the
wireless network. However, the agent code, data and state have to be transferred via
the wireless link. In policy one, which is a never-migrate policy, the personal agent is
located at the user’s home workstation and faces fewer threats from insecure hosts. It
is therefore probably the most secure policy.



Performance Evaluation of Personal Agent Migration Policies 








[Figure 3: plots not recoverable from the source; one axis is labelled Response Time (s)]

Fig. 3. Network Load and Response Time as functions of agent size and $B_{reply}$

Two-dimensional versions of the graphs of Figure 3 are presented in Figure 4. In these
diagrams the reply size is assumed to be 1 MB, and because there is no agent migration
in policies one and three, they have constant values of network load and response time
for all agent sizes. It can be seen that policy four has the greatest network load,
especially for large agent sizes. Policy two has the minimum network load among the four
policies. However, considering the differences in connection types or throughput, policy
four has the lowest response time. In this case the cost of agent migration is worth its
benefits for a large reply size. Policy two, in which the personal agent is always with
the user, even in the wireless terminal, is the worst for the assumed reply size. Policy
one, the stay-at-home policy, is better than this policy for agent sizes of more than
800 KB. Hence, in the case of a large reply size, it is better not to transfer the
personal agent to the mobile terminal. Cloning the personal agent, as in policy three,
lowers the response time and offers a good response time compared to policies one and
two.








Fig. 4. Network Load and Response Time as functions of agent code size (two-dimensional) 



5.2 Emulator Architecture and Experimental Results 

In order to evaluate the policies, their simulation is not enough and the performance
must be validated in an environment as close as possible to that of the real world. That
is why an emulator is envisaged. The implemented software emulates the use-case
scenario and can be used to produce experimental results. The detailed analysis and
design of the software using an agent-oriented software engineering methodology, GAIA,
is described in Mobile VCE technical reports. Jatlite is used as the agent framework for
the implementation of the software [12]. The general architecture of the software is
shown in Figure 5. The Personal Agent (PA) is the user’s personalised agent, which is
responsible for various actions, from diary and calendar management to filtering Internet
content and arranging meetings. The Location Detector Agent (LDA) recognises user
movement through sensors and reports it to the user. The User Agent (UA) has been
implemented to simulate the user, and this agent is fed by a user request file. The
emulator starts by activating these three agents, after which they start sending
messages to and receiving messages from each other. The scenario is distributed between
five main files: user calendar, user profile, user requests, user movements and location
of terminals. By changing the content of these files we can feed another use-case
scenario to our emulator.




Fig. 5. Emulator architecture 

Messages between these agents are passed through the Jatlite AMR 
(Agent Message Router), so messages can be buffered if the agent is disconnected 
and delivered to the destination agent when it reconnects. Since all 
messages go through the router, the response time for different connection 
types can be measured by running the router on one computer and the agents on 
another. As an example, response time can be measured for sending a message 
from Agent 1 to Agent 2, both running on the same computer, and receiving the reply 
back via the router running on another computer with a dial-up connection to the first 
computer. A faster type of connection can be emulated by using a Local Area 
Network for sending and receiving messages between the router and the agents. Response 
time is measured for different message sizes. Having these experimental results for 
single interactions, such as the $L_0$-$L_1$ interaction in the $S_1$ or $S_2$ sequence, with 
different message sizes, we can replace mathematical equations in our performance model 
by experimental results. Although the experimental results are specifically for single 
interactions in the message passing case, we can estimate agent transfer time based on 
the measured results. In these measurements agent code size, reply size and selectivity 
factor are assumed to be 18 KB, 15 KB and 0.5 respectively. Network delay and agent 
state size are assumed to be zero. Table 3 shows that the experimental results 
validate the analytical results for the scenario and that policy four is the best policy in 
terms of response time under the specified assumptions. 



Table 3. Comparison of analytical and experimental results 



Policy   Mobility Vector D                   T_sec (Analytical)   T_sec (Experimental) 
One      L0, L0, L0, L0, L0, L0, L0          149.9469             171.9 
Two      L0, L0, L>2, L>2, L2, L>2, L>2      144.4602             143.4800 
Three    L0, L0, L2, L2, L2, L2, L2          141.0973             131.5100 
Four     L0, L0, L3, L3, L4, L5, L5          74.1927              68.7650 
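
To make the comparison in Table 3 easier to read, the following minimal Python sketch computes how far each measured response time deviates from its analytical value and which policy is fastest in each column. The numbers are copied from Table 3; the names `TABLE_3`, `relative_deviation` and `fastest_policy` are illustrative and do not come from the emulator.

```python
# Response times in seconds, copied from Table 3: (analytical, experimental).
TABLE_3 = {
    "One": (149.9469, 171.9),
    "Two": (144.4602, 143.4800),
    "Three": (141.0973, 131.5100),
    "Four": (74.1927, 68.7650),
}


def relative_deviation(analytical: float, experimental: float) -> float:
    """Signed deviation of the measurement from the analytical value, in percent."""
    return 100.0 * (experimental - analytical) / analytical


def fastest_policy(column: int) -> str:
    """Policy with the lowest response time (0 = analytical, 1 = experimental)."""
    return min(TABLE_3, key=lambda name: TABLE_3[name][column])


if __name__ == "__main__":
    for name, (analytical, experimental) in TABLE_3.items():
        print(f"{name:5s} {relative_deviation(analytical, experimental):+6.1f} %")
    print("fastest:", fastest_policy(0), fastest_policy(1))
```

With these values the measured time for policy one is about 14.6% above the analytical value, the other three policies are within roughly 7%, and policy four is the fastest in both columns.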



6 Conclusion 

This paper has investigated the impact of user agent migration on the performance 
of an ambient use-case scenario. Based on three groups of agent migration 
policies (never-migrate, always-migrate 
and sometimes-migrate), four migration policies have been proposed. A performance 
model based on two quantitative performance metrics, network load and the 
average response time for processing the user request, has been applied to compare 
the performance of the proposed migration policies. The mapping of the scenario to the 
performance model and the analytical results have also been discussed. An agent emulator has been 
designed and implemented in the Java programming language based on the Jatlite 
agent framework. The emulator has been used to produce the experimental results of 
this study. 

As the results indicate, when agent size is smaller than reply size, policy 
four, in which the personal agent follows the mobile user in the fixed part of the 
wireless network, lowers the response time and promises to be even more beneficial 
in realistic measurements. However, when the reply size is smaller than agent size, a 
simple stay-at-home policy outperforms the other three. Although these results are 
valid for the specific scenario discussed here, the emulator designed and implemented 
in this study has the potential to evaluate any set of user agent migration policies for 
any use-case scenario. The intention of this paper is to show how a personal agent’s 
decision to migrate or stay can affect the performance in terms of network load and 
response time, rather than to introduce one migration policy as the policy with the best 
performance in general. 

This paper shows that for personal agents that provide ambient intelligence 
for users, the decision to stay or migrate has a great impact on their performance. 
Reasoning algorithms and techniques for migration are therefore crucial. The main 
aim of this article is to provide a sample of research that generates valuable data 
for the learning phase of personal agents, helping them to reason and make decisions about 
their migration in order to perform more efficiently. An open issue that 
could be pursued after this study is how to equip personal agents with a decision-making 
mechanism to improve the performance of service delivery to the end-user. 

Abstract. With the rapid growth in demand for and deployment 
of wireless LAN (WLAN), much traffic, including multimedia traffic, 
is forced to travel over WLAN. Since the legacy IEEE 802.11 standard 
is unable to provide adequate quality of service (QoS) for multimedia 
traffic, the IEEE 802.11e group developed medium access control (MAC) protocol 
improvements to support QoS-sensitive applications and to make more 
efficient use of the wireless channel. Although this QoS enhancement is 
a great improvement over the legacy standard in terms of capability and 
efficiency of the protocol, it does not address how to maintain 
QoS when a wireless station (STA) changes its associated access point 
(AP) due to its mobility. Thus, in this paper, we propose an effective QoS 
provision method that can guarantee QoS requirements in WLAN both when 
the STA is turned on and when it performs handoff. In order to guarantee QoS 
requirements, the proposed method uses APs with dual radio frequency 
(RF) modules. By adding to the AP an RF module that can only receive 
signals (SNIFFER), the AP can eavesdrop on the channels of its neighboring 
APs selected by the modified neighbor graph (NG). Experimental results 
show that the proposed mechanisms can guarantee QoS for multimedia 
traffic without the ping-pong phenomenon by drastically reducing the link layer (L2) 
handoff delay and distributing the system loads. 



1 Introduction 

With the explosive growth of wireless Internet and other WLAN applications, 
it has become more and more important to improve the spectral efficiency and 
throughput of WLAN systems. Since the MAC protocol plays a critical role in 
determining the spectral efficiency of the system, the IEEE 802.11 WLAN MAC [1] 
protocol has been intensely analyzed and various mechanisms have been proposed 
for performance enhancement [2,3]. In order to achieve the maximum 
coverage and throughput of the overall WLAN system, the placement 
and channel assignment of APs in WLAN have also been studied. In [4,5], an 
approach that optimizes the placement and channel assignment of APs by formulating 
an integer linear programming (ILP) problem was proposed. 
Kamenetsky proposed a combination of the two approaches using pruning for 
obtaining an initial set of AP positions and refining the initial set by 
neighborhood search or simulated annealing [6]. 

There are at least two kinds of applications that are expected to take ad- 
vantage of the success of the IEEE 802.11, namely AV transmission in home 
networks and Voice over IP (VoIP). Both applications need very tight QoS sup- 
port. Since the legacy IEEE 802.11 standard was not able to provide adequate 
QoS for these applications, a new task group (TGe) in IEEE 802.11 was initiated 
to enhance IEEE 802.11 MAC layer. TGe focused on the mechanisms to enhance 
QoS when an STA is not moving. However, it is also important to support mo- 
bile QoS in order to enable a better mobile user experience and to make more 
efficient use of the wireless channel. 

In order to provide QoS during movement, we propose two mechanisms: an effective 
initial association method and a fast handoff method based on QoS awareness. 
Using the proposed effective initial association, the STA can select a proper 
AP that provides adequate QoS when the STA is turned on. Using the proposed fast 
handoff, the STA can change its associated AP without disconnection time and without 
the ping-pong phenomenon caused by improper selection of an AP. 
The proposed methods are based on the modified NG [7] and utilize dual RF 
modules. 

This paper is organized as follows. Section 2 reviews the handoff procedure 
in IEEE 802.11 and the NG before introducing the proposed mechanisms. We 
describe the proposed mechanisms in Section 3. Section 4 shows the experimental 
results and presents brief concluding remarks. 

2 Background 

Our proposed methods are based on the conventional handoff procedure in IEEE 
802.11 and utilize the NG. Thus, before introducing the proposed methods, we 
briefly review the handoff procedure in IEEE 802.11 and the concept of the NG. 

2.1 IEEE 802.11 Handoff 

An STA continuously monitors the signal strength and link quality from the 
associated AP. If the signal strength is too low, the STA scans all the channels 
to find a neighboring AP that produces a stronger signal. By switching to another 
AP, referred to as handoff, the STA can distribute the load and increase 
the performance of other STAs. 

The complete handoff procedure can be divided into three distinct logical 
phases: scanning, authentication, and reassociation. During the first phase, an 
STA scans for APs by either sending ProbeRequest messages (Active Scanning) 
or listening for Beacon messages (Passive Scanning). After scanning all the 
channels, the STA selects an AP using the received signal strength indication (RSSI), 
link quality, etc. The STA then exchanges IEEE 802.11 authentication messages 
with the selected AP. Finally, if the AP authenticates the STA, the association 
moves from the old AP to the new AP in the following steps: 




Fig. 1. IEEE 802.11 handoff procedure with IAPP 



(1) An STA issues a ReassociationRequest message to a new AP. The new AP 
must communicate with the old AP to confirm that a previous association 
existed; 

(2) The new AP processes the ReassociationRequest; 

(3) The new AP contacts the old AP to finish the reassociation procedure with 
IAPP [8]; 

(4) The old AP sends any buffered frames for the STA to the new AP; 

(5) The new AP begins processing frames for the STA. 

The delay incurred during these three phases is referred to as the L2 handoff 
delay, which consists of probe delay, authentication delay, and reassociation delay. 
Since the radio link is disconnected during the L2 handoff, shortening the 
handoff delay is crucial to real-time multimedia services such as VoIP. Figure 1 
shows the three phases, delays, and messages exchanged in each phase. 



2.2 Concept of the NG 

In this subsection, we describe the notion of and motivation for the NG, and the 
abstractions it provides. Given a wireless network, an NG capturing the 
reassociation relationships is constructed [7]. 

Reassociation Relationship: Two APs, $ap_i$ and $ap_j$, are said to have a 
reassociation relationship if it is possible for an STA to perform an IEEE 802.11 
reassociation through some path of motion between the physical locations of 
$ap_i$ and $ap_j$. The reassociation relationship depends on the placement of APs, 
signal strength, and other topological factors and in most cases corresponds to 
the physical distance (vicinity) between the APs. 

Fig. 2. Concept of the NG. (a) Placement of APs (b) Corresponding NG 

Data Structure (NG): Define an undirected graph $G = (V, E)$ where $V = 
\{ap_1, ap_2, \ldots, ap_n\}$ is the set of all APs constituting the wireless network, and 
the set $E$ includes all existing edges $e_{ij}$, where $e_{ij} = (ap_i, ap_j)$ represents the 
reassociation relationship. There is an edge $e_{ij}$ between $ap_i$ and $ap_j$ if they have 
a reassociation relationship. Define $N(ap_i) = \{ap_{i_k} : ap_{i_k} \in V,\ e_{i i_k} \in E\}$, i.e., the 
set of all neighbors of $ap_i$ in $G$. 

The NG can be implemented either in a centralized or a distributed manner. 
In this paper, the NG is implemented in a centralized fashion, with the 
correspondent node (CN) storing the whole NG data structure (see Fig. 8). The NG can 
be generated automatically from IEEE 802.11 management 
messages by the following algorithm. 

(1) If an STA associated with $ap_j$ sends a ReassociationRequest to $ap_i$, then add 
an element to both $N(ap_i)$ and $N(ap_j)$ (i.e., an entry in $ap_i$ for $j$ and vice 
versa); 

(2) If $e_{ij}$ is not included in $E$, then create a new edge. The creation of a new 
edge requires a longer time and can be regarded as a ‘high latency handoff’. This 
occurs only once per edge. 

The NG proposed in [7] uses the topological information on APs. Our proposed 
algorithm, however, requires the channels of APs as well as topological 
information. Thus, we modify the data structure of the NG as follows: 

\[
\begin{aligned}
G' &= (V', E), \\
V' &= \{v_i : v_i = (ap_i, channel),\ v_i \in V\}, \\
e_{ij} &= (ap_i, ap_j), \\
N(ap_i) &= \{ap_{i_k} : ap_{i_k} \in V',\ e_{i i_k} \in E\},
\end{aligned}
\]

where $G'$ is the modified NG, and $V'$ is the set which consists of APs and their 
channels. In Section 3, we develop a fast handoff algorithm based on the modified 
NG. 
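
As a minimal sketch of the data structure just described, the following Python class keeps each AP together with its channel and adds an undirected edge whenever a reassociation is observed. It assumes a centralized store such as the CN; the class, method and AP names, and the channel of AP3, are illustrative assumptions rather than the authors' implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class NeighborGraph:
    """Modified NG: every vertex is an AP together with its channel."""

    channel: Dict[str, int] = field(default_factory=dict)         # V': AP -> channel
    neighbors: Dict[str, Set[str]] = field(default_factory=dict)  # N(ap_i)

    def add_ap(self, ap: str, channel: int) -> None:
        self.channel[ap] = channel
        self.neighbors.setdefault(ap, set())

    def on_reassociation_request(self, old_ap: str, new_ap: str) -> bool:
        """Record that an STA associated with old_ap reassociated to new_ap.

        Returns True if the edge did not exist yet, i.e. the one-off
        'high latency handoff' that creates the edge.
        """
        is_new_edge = new_ap not in self.neighbors[old_ap]
        self.neighbors[old_ap].add(new_ap)  # the graph is undirected,
        self.neighbors[new_ap].add(old_ap)  # so both entries are updated
        return is_new_edge

    def neighbor_channels(self, ap: str) -> Set[int]:
        """Channels used by the neighbors of ap (the extra information in G')."""
        return {self.channel[n] for n in self.neighbors[ap]}


if __name__ == "__main__":
    ng = NeighborGraph()
    for name, ch in [("AP1", 1), ("AP2", 6), ("AP3", 11)]:  # assumed channels
        ng.add_ap(name, ch)
    print(ng.on_reassociation_request("AP2", "AP3"))  # first handoff creates the edge
    print(ng.on_reassociation_request("AP2", "AP3"))  # edge already known
```

Storing the channel next to each vertex is what later lets an AP restrict its second receiver to the channels of its neighbors instead of scanning every channel.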



Fig. 3. Block diagram of the proposed method for QoS provision 



3 Proposed Mechanisms 

An association between the AP and the STA is established in the following two 
cases. First, when the STA is turned on, it tries to establish an association with 
an AP so that it can access the WLAN. Second, when the STA moves into the range of another 
AP because of mobility, the STA establishes a new association with that AP. 
Thus, in this paper, we propose two methods for QoS provision: an effective initial 
association method and a fast handoff method based on QoS awareness. Figure 3 
shows the block diagram of the proposed methods. 




(a) RF1: Receive and Transmit Signals 

(b) RF1: Receive and Transmit Signals; RF2: Only Receive Signals (SNIFFER) 



Fig. 4. (a) General AP with single RF module (b) Proposed AP with dual RF modules 

3.1 Proposed AP with Dual RF Modules 

In general, an AP contains one RF module, which receives and transmits signals 
alternately in the assigned channel (see Fig. 4 (a)). By adding to the AP a second RF 
module that can only receive signals (SNIFFER), 
the AP can eavesdrop on the channels of its neighboring APs selected by the NG. Figure 4 
(b) shows the proposed architecture of the AP. If an STA enters the cell range 
of a new AP, the SNIFFER of the new AP can eavesdrop on its MAC frames. Thus, 
by examining the MAC frames of the incoming STA, the new AP can obtain the address 
of the AP associated with the STA. 



Fig. 5. Selecting channel frequencies for APs. (a) Channel overlap for 802.11b APs (b) 
Example of channel allocation 



As shown in Fig. 5 (a), a basic service set (BSS) can use only three channels 
at the same site because of interference between adjacent APs. For example, in 
the USA, channels 1, 6, and 11 are used. If the SNIFFER knows the channels of 
neighboring APs, it does not need to receive frames in all channels. In Fig. 6, for 
example, the SNIFFER of AP3 receives frames in channels 1 and 6, selected 
by the modified NG. Assume that the destination address in the received frame 
is the address of AP2. Then AP3 informs its neighboring AP (AP2) that the 
STA associated with AP2 has entered the cell range of AP3. 
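
The channel selection and movement detection in this example can be sketched as follows. This is an illustrative Python sketch under stated assumptions: the tables `AP_CHANNEL` and `NEIGHBORS` stand in for an excerpt of the modified NG, the assignment of channels 1 and 6 to AP1 and AP2 and of channel 11 to AP3 is assumed, and the function names are hypothetical.

```python
from typing import Dict, Optional, Set

# Hypothetical excerpt of the modified NG (channel assignment is an assumption).
AP_CHANNEL: Dict[str, int] = {"AP1": 1, "AP2": 6, "AP3": 11}
NEIGHBORS: Dict[str, Set[str]] = {
    "AP1": {"AP3"},
    "AP2": {"AP3"},
    "AP3": {"AP1", "AP2"},
}


def channels_to_sniff(ap: str) -> Set[int]:
    """Channels the SNIFFER of `ap` listens on: only those of its NG neighbors."""
    return {AP_CHANNEL[n] for n in NEIGHBORS[ap]}


def detect_incoming_sta(sniffing_ap: str, frame_channel: int,
                        destination_ap: str) -> Optional[str]:
    """Return the neighboring AP to inform about an incoming STA, or None.

    `destination_ap` is the AP addressed by the eavesdropped MAC frame,
    i.e. the AP the STA is currently associated with.
    """
    if frame_channel not in channels_to_sniff(sniffing_ap):
        return None  # the SNIFFER never tunes to this channel
    if destination_ap not in NEIGHBORS[sniffing_ap]:
        return None  # frame is not addressed to a known neighbor
    return destination_ap


if __name__ == "__main__":
    print(sorted(channels_to_sniff("AP3")))
    print(detect_incoming_sta("AP3", 6, "AP2"))
```

Restricting the SNIFFER to the neighbors' channels is the design choice that keeps the receive-only module from cycling through every channel.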

The proposed AP does not require any hardware change to the STA. An existing wireless 
network interface card (WNIC) does not require extra devices and can be supported 
simply by upgrading its software. This is a very important factor for vendors. 

3.2 Effective Initial Association Based on QoS Awareness 

As discussed above, after an STA is turned on, it starts to collect 
information on channels by receiving Beacon frames in order to determine the AP to 
associate with. However, the information obtained from Beacon frames is 
not enough to support the various requirements of STAs because the Beacon frame 
does not provide QoS information on the AP. Most multimedia applications 
require high throughput and short delay. In particular, high-quality VOD services 
need high throughput, although delay can be compensated for by buffering 
schemes, whereas VoIP services need short delay because of 
the characteristics of real-time interactive multimedia services. 




(1) Move (2) Move Detection 

(3) Notify the STA's Movement (4) Inform the Information on the Neighboring AP 
(5) Handoff Initiation Request (6) Handoff Initiation Response 
(7) Handoff Execution 



Fig. 6. Fast handoff based on QoS awareness 



Therefore, we introduce a method that adds two QoS parameters, 
the available throughput and the average contention period of the AP, to the Beacon 
frame. The contention period is the time interval between packet arrival at the 
MAC layer and transmission of the packet to the destination. Note that the 
average contention period covers the collision, back-off, and listening periods in 
which another STA captures the channel. The intended sender is a non-participant 
STA during the listening period until the channel is idle again. Using 
the available throughput of APs, STAs that demand high throughput 
for a high-quality VOD service can select an AP that provides 
sufficient available throughput, and STAs that need the VoIP service can 
associate with an AP supporting a short contention period. 
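
A minimal Python sketch of this selection step is given below. It assumes that each Beacon carries the two proposed parameters and that, among the APs meeting the STA's requirement, the one with the strongest signal is chosen; that tie-breaking rule, the field names and the units are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Beacon:
    ap: str
    rssi_dbm: float
    available_throughput_mbps: float  # proposed additional Beacon parameter
    avg_contention_period_ms: float   # proposed additional Beacon parameter


def select_ap(beacons: List[Beacon],
              min_throughput_mbps: float = 0.0,
              max_contention_ms: float = float("inf")) -> Optional[Beacon]:
    """Pick the strongest-signal AP among those meeting the QoS requirement.

    With the default arguments this reduces to the conventional choice
    based on signal strength alone.
    """
    suitable = [
        b for b in beacons
        if b.available_throughput_mbps >= min_throughput_mbps
        and b.avg_contention_period_ms <= max_contention_ms
    ]
    return max(suitable, key=lambda b: b.rssi_dbm, default=None)


if __name__ == "__main__":
    heard = [
        Beacon("AP1", -50.0, 1.0, 20.0),
        Beacon("AP2", -60.0, 8.0, 15.0),
        Beacon("AP3", -70.0, 3.0, 2.0),
    ]
    print(select_ap(heard).ap)                           # signal strength only
    print(select_ap(heard, min_throughput_mbps=5.0).ap)  # VOD-like requirement
    print(select_ap(heard, max_contention_ms=5.0).ap)    # VoIP-like requirement
```

In this made-up example the conventional choice is AP1, while the throughput-bound STA is steered to AP2 and the delay-bound STA to AP3.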

3.3 Fast Handoff Based on QoS Awareness 

Before handoff, the STA or the network must first decide on handoff according to a 
certain policy. Handoff policies can be classified into three categories [9]: network-controlled 
handoff, network-controlled and STA-assisted handoff, and STA-controlled 
handoff. In IEEE 802.11 WLAN, handoff is entirely controlled by the STA. 






Fig. 7. Exchanged messages using the proposed handoff algorithm 



In general, when the STA performs handoff, it selects the AP that provides 
the strongest signal regardless of the QoS requirement of its application. 
Thus, after the STA is associated with the AP, the ping-pong phenomenon 
occurs if the AP cannot support the QoS requirement of the application. 
This phenomenon is particularly harmful to real-time multimedia services that 
require low handoff delay. 

Therefore, in this paper, we propose a new type of fast handoff algorithm 
that is STA-controlled and network-assisted. The STA appends the required 
QoS information to the packets to be transmitted. By eavesdropping on neighboring 
channels, SNIFFERs are able to collect the QoS information of the STAs. Then the 
network encourages the STA to hand off by providing it with information on 
the new AP, and the STA decides whether the network situation matches the 
handoff criterion. If it does, the STA performs handoff 
according to the received information. Figure 7 shows the proposed handoff 
procedure: 

(1) The STA associated with AP2 moves toward AP3; 

(2) The SNIFFERs of AP1 and AP3 receive the STA’s MAC frames; 

(3) Using the destination address in the MAC frame of the STA, the SNIFFERs 
of AP1 and AP3 learn that the STA is associated with AP2. 
Let $QoS_{ap_i}$ denote the QoS information of each AP. If $QoS_{ap_3}$ is adequate 
for the service requirement of the STA, the SNIFFER of AP3 sends AP2 
a message including handoff information such as the measured RSSI, the 
MAC address of the STA, and so on; 

(4) As the wireless medium is a very scarce resource, AP2 must not forward all 
messages from other APs to the STA. After removing redundant messages, 
AP2 relays messages to the STA; 

(5) The STA decides on handoff according to the received message. After deciding 
to hand off, the STA sends a handoff initiation message to AP3 through AP2. 
If the STA does not need handoff, it does not issue a handoff initiation 
message; 

(6) AP3 sends a response message to the STA. In this phase, AP3 can supply 
L2 triggers to L3; 

(7) The STA performs handoff from AP2 to AP3. 
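
Steps (4) and (5) of the procedure above can be sketched in Python as follows. The paper does not specify how redundant messages are recognised or what the handoff criterion is, so both rules here are assumptions chosen for illustration: the old AP keeps the strongest report per candidate AP, and the STA hands off only if a candidate is stronger than its current AP by a fixed margin. All names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class HandoffInfo:
    candidate_ap: str  # AP whose SNIFFER heard the STA
    sta_mac: str
    rssi_dbm: float    # RSSI measured by that SNIFFER


def relay_to_sta(messages: List[HandoffInfo]) -> List[HandoffInfo]:
    """Step (4), assumed rule: keep one message per (candidate AP, STA)."""
    best: Dict[Tuple[str, str], HandoffInfo] = {}
    for msg in messages:
        key = (msg.candidate_ap, msg.sta_mac)
        if key not in best or msg.rssi_dbm > best[key].rssi_dbm:
            best[key] = msg
    return list(best.values())


def decide_handoff(current_rssi_dbm: float, offers: List[HandoffInfo],
                   margin_db: float = 5.0) -> Optional[str]:
    """Step (5), assumed criterion: candidate must beat the current AP by a margin."""
    better = [o for o in offers if o.rssi_dbm >= current_rssi_dbm + margin_db]
    if not better:
        return None  # no handoff initiation message is issued
    return max(better, key=lambda o: o.rssi_dbm).candidate_ap


if __name__ == "__main__":
    received = [
        HandoffInfo("AP3", "sta-1", -60.0),
        HandoffInfo("AP3", "sta-1", -55.0),  # redundant report from AP3
        HandoffInfo("AP1", "sta-1", -80.0),
    ]
    relayed = relay_to_sta(received)
    print(len(relayed), decide_handoff(-70.0, relayed))
```

A margin of this kind is one common way to avoid oscillating between two APs; whether the authors use a comparable rule is not stated in the text.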

As discussed in Section 2.1, the L2 handoff delay consists of scanning, 
authentication, and reassociation delays. Mishra [10] shows that the scanning delay 
is dominant among the three delays. The proposed handoff algorithm 
eliminates the scanning phase by using the SNIFFER. In addition, the STA can be 
authenticated and associated before handoff [11]. If the network allows this preprocessing, the 
L2 handoff delay can be eliminated. Furthermore, because the proposed 
algorithm supplies L2 triggers, the L3 handoff delay can be diminished drastically. 

4 Experimental Results and Conclusion 

We developed an experimental platform in order to evaluate the proposed 
mechanisms in terms of L2 handoff delay. Figure 8 shows our experimental platform 
consisting of an STA, APs, a router, and a CN. All machines we used are 
SAMSUNG SENSV25 laptops with a Pentium IV 2.4 GHz processor and 1,024 MB RAM. All 
machines use RedHat Linux 9.0 as the operating system. To exchange the NG 
information, a socket interface is used, and Mobile IPv6 is applied to maintain L3 
connectivity while experimenting. The device driver of a common WNIC was 
modified so that the STA operates as an AP that can support the handoff initiation 
message. For the proposed mechanism, we developed three programs: NG Server, 
NG Client, and SNIFFER. The NG Server manages the data structure of the NG 
on the experimental platform and processes the requests of the NG Client, which 
updates the NG information of the STA after the STA moves to another 
AP. Using the destination address in the MAC frame of the STA, the SNIFFER 
can identify the AP associated with the STA. If its AP can support the 
requirement of the STA, the SNIFFER sends the old AP a message including 
handoff information such as the eavesdropped QoS parameters, the measured 
RSSI, the available throughput, the MAC address of the STA, and so on. In the 
proposed handoff scheme, the handoff delay is defined as the duration between 
the first ReassociationRequest message and the last ReassociationResponse 
message. As shown in Table 1, the average L2 handoff delay incurred by the proposed 
method is much shorter than that incurred by the conventional method. 




Fig. 8. Experimental platform 



Table 1. Average Handoff Delay 

Measuring method       Average delay [ms] 
Proposed method        3.9 
Conventional method    259 
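
To put the two averages in Table 1 in proportion, the short Python calculation below derives the ratio and the relative reduction. The two constants are taken from Table 1; the variable names are illustrative.

```python
# Average L2 handoff delays from Table 1, in milliseconds.
PROPOSED_MS = 3.9
CONVENTIONAL_MS = 259.0

# How many times shorter the proposed delay is, and the relative reduction.
speedup = CONVENTIONAL_MS / PROPOSED_MS
reduction_pct = 100.0 * (1.0 - PROPOSED_MS / CONVENTIONAL_MS)

print(f"{speedup:.1f} times shorter, {reduction_pct:.1f} % reduction")
```

The proposed method is thus roughly 66 times faster on average, a reduction of about 98.5%.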



We also simulated the proposed mechanisms using MATLAB to evaluate their 
performance in terms of load balancing based on QoS awareness. First, we 
constructed a test environment for the proposed effective initial association based 
on QoS awareness as follows: 100 APs and 500 STAs are deployed within an 
area of 1000 x 1000. Each AP has its own throughput and contention period, 
randomly assigned, and each STA runs various applications with 
different QoS requirements. We let the STAs associate with the APs using both 
methods, the conventional method and the proposed method. As the number of 
STAs associated with an AP increases, the available throughput and average 
contention period supported by the AP degrade. Thus some overloaded APs 
cannot support the QoS requirements of the STAs connected to them. As shown 
in Table 2, with the proposed algorithm, the loads offered to APs are distributed 
and the QoS requirements of the STAs are guaranteed when the STAs are turned 
on. 

Table 2. Ratios of the overloaded APs and not fully supported STAs when the STAs 
are turned on initially 





                       Overloaded APs (%)    Not fully supported STAs (%) 
Conventional method    10.3                  9.2 
Proposed method        0                     0 



Second, in order to evaluate the proposed fast handoff based on QoS awareness, 
we make an STA move randomly. Table 3 shows that the proposed 
algorithm significantly reduces the share of time during which the QoS requirement of the 
moving STA is not guaranteed. 



Table 3. Total time when the QoS requirement of the STA is not guaranteed 





                       Duration (QoS is not guaranteed) (%) 
Conventional method    27.6 
Proposed method        1.2 



We measured the L2 handoff delay of the proposed fast handoff algorithm 
on the experimental platform and evaluated the load distribution based on QoS 
awareness of the proposed mechanisms by computer simulation. Initially, 
the QoS requirement of the STA can be guaranteed by attaching QoS information 
to Beacon frames when STAs are turned on. If the STA has to perform handoff 
due to its mobility, the STA can change its associated AP to another AP without 
the ping-pong phenomenon. It is expected that real-time multimedia services can be 
realized without disconnection by the proposed fast handoff based on the NG. 

Abstract. Ambient intelligence applications need to recognise 
user activity calmly in the background, typically by instrumentation of 
environments. In contrast, we propose the concept of Cooperative Artefacts 
(CAs) to instrument single artefacts that cooperate with each other to acquire 
knowledge about their situation in the world. CAs do not rely on external 
infrastructure as they implement their architectural components, i.e. perceptual 
intelligence, domain knowledge and a rule-based inference engine, on 
embedded devices. We describe the design and implementation of the CA 
concept on an embedded systems platform and present a case study that 
demonstrates the potential of the CA approach for activity recognition. In the 
case study we track surface-based activity of users by augmenting a table and 
household goods. 



1 Introduction 

Ambient Intelligence research aims to create environments that support the needs of 
people with ‘calm’ technology [4, 31]. To this effect many ambient intelligence 
systems make use of knowledge about activities occurring in the physical 
environment to adapt their behaviour. Hence, one of the central research challenges of 
ambient intelligence research is how such systems acquire, maintain, and use models 
of their changing environment. Approaches to address this challenge are generally 
based on instrumentation of devices, physical artefacts or entire environments. 
Specifically, instrumentation of otherwise non-computational artefacts has been 
shown to play an important role, e.g. for tracking of valuable goods [9, 18, 26], 
detecting safety critical situations [28], or supporting human memory [17]. In most 
augmented artefact applications artefacts are instrumented to support their 
identification [18, 21, 26] while perception, reasoning and decision-making is 
allocated in backend infrastructures [1, 6] or user devices [23, 27]. This approach, 
however, makes artefacts reliant on supporting infrastructure, and ties applications to 
instrumented environments. 

In this paper we introduce the notion of Cooperative Artefacts (CAs) that combine 
sensing, perception, reasoning and communication. Such artefacts are able to 
cooperatively identify, track and interpret activities in dynamically changing 
environments without relying on external infrastructure. Cooperative artefacts model 
their situation on the basis of domain knowledge, observation of the world, and 
sharing of knowledge with other artefacts. Thus, knowledge about the real world 
becomes integral with the artefact itself. 

In section 2 we introduce the notion of Cooperative Artefacts and describe the 
structure and mechanisms by which CAs cooperate to acquire knowledge and reason 
about the world. In section 3 we describe an implementation of the CA concept on an 
embedded systems platform based on a case study to demonstrate the potential of our 
approach for activity recognition applications. In the case study an augmented table 
and household goods are used to track surface-based user activities. In section 4 we 
describe the interaction between involved artefacts. Finally, we discuss related work 
in section 5 and conclude our paper in section 6. 



2 Cooperative Artefacts 

Cooperative Artefacts (CAs) are self-contained physical entities that are able to 
autonomously observe their environment and reason about these observations, thus 
acquiring knowledge about their world. Most notably they cooperate by sharing their 
knowledge which allows them to acquire more knowledge collectively than each of 
them could acquire individually. It is a defining property of our approach that world 
knowledge associated with artefacts is stored and processed within the artefact itself. 
Although this structure is independent of any particular hardware platform, all 
components are intended to be implemented on low-powered embedded devices with 
inherent resource limitations. Figure 1 depicts the architecture of a Cooperative 
Artefact. 



Fig. 1. Architecture of a Cooperative Artefact 

• Sensors. Cooperative Artefacts include sensors which provide measurements of 
phenomena in the physical world. 

• Perception. The perception component associates sensor measurements with 
meaning, producing observations that are meaningful in terms of the application 
domain. 

• Knowledge base. The knowledge base stores the acquired knowledge of the 
artefact and externalises the artefact knowledge. 

• Inference. The inference component processes the knowledge of an artefact as 
well as knowledge provided by other artefacts to infer further knowledge. 



2.1 Structure of the Artefact Knowledge Base 

The artefact knowledge base is structured into facts and rules. Facts are the 
foundation for any decision-making and action-taking within the artefact, and rules 
allow inferring further knowledge based on facts and other rules (table 1). 



Table 1. Knowledge managed by a Cooperative Artefact. 

Domain Knowledge:         Domain knowledge built into the artefact, e.g. facts 
                          describing the physical nature of the artefact or general 
                          world knowledge. 

Observational Knowledge:  Knowledge describing the situation of an artefact in the 
                          world. It is based on facts that result from sensor-based 
                          observations. 

Inferred Knowledge:       Rules are used to infer further knowledge based on 
                          previously established facts, which may be based on 
                          domain knowledge, observation, previous inference, or 
                          knowledge made available by cooperating artefacts. 



2.2 Cooperation Between Artefacts 

Activity recognition applications will rely on rich knowledge about users and their 
environment. It is therefore a desirable feature that artefacts cooperate to maximise 
the knowledge available about the physical world. Our model for cooperation is that
artefacts share knowledge. More specifically, knowledge stored in an artefact’s
knowledge base is made available to other artefacts, where it feeds into the inference
process and generates additional knowledge. Effectively, the artefact knowledge bases
taken together form a distributed knowledge base on which the inference processes in
the individual artefacts can operate.

Although different artefacts may be able to observe the same physical
phenomenon, the acquired knowledge is likely to be incomplete and to differ between
observing artefacts. This is due to the nature and diversity of the sensors and
perception algorithms used. However, the artefacts can use the distributed knowledge
base to exchange and reason about their knowledge. This leads to a synergetic effect:
cooperating artefacts are able to acquire more knowledge collectively than each of
them could acquire individually. This principle is illustrated in Figure 2.

[Figure: artefacts connected by links labelled "cooperate"]

Fig. 2. Cooperation of artefacts is based on sharing of knowledge



3 CA Case Study: Tracking Surface-Based User Activities 

Recent research has identified activity centres as areas in domestic settings where
human activity is focused [8]. This research suggests that the identified activity
centres, such as surfaces provided by kitchen tables, should be considered as prime 
sites for technological augmentation. Other research has shown that tagged artefacts 
may reveal valuable information about users’ activity [17]. Based on this research we 
chose to track surface-based user activities as a case study for Cooperative Artefacts. 
The basic application idea is to track artefacts on and across tables to infer the user 
activity. 




Fig. 3. CA demonstrator 

The CA demonstrator (Fig. 3) was developed to show the potential of Cooperative
Artefacts for activity recognition applications. Glasses and jugs were built as artefacts
that are aware of whether they are on a surface or not. They cooperate with a
load-sensing table to infer their location on the table and, as a further synergetic
effect of cooperation, they can infer their filling state. The picture in Fig. 3 shows
the table, glasses and jugs as they were demonstrated at SIGGRAPH [13]. This setup also
included chairs to measure the weight distribution of people sitting on them. A graphical
representation of the position of the glasses and the jug on the table was projected on
a screen in order to display the recognised interactions of visitors with the artefacts.
This section describes the implementation of the CA architecture on an embedded
device platform.

3.1 Artefact Sensors and Perception 

As a first step, the artefacts involved in the demonstrator require some basic form of
intelligence, i.e. they need sensors and computing power to implement perception.
For this purpose we used the DIY Smart-its platform, an easy-to-use and highly
customizable hardware platform for wireless sensing applications [11]. The modular
design of the DIY Smart-its allows a range of different sensors to be used by plugging
add-on boards onto a basic processing and communication board.

We used an add-on board to interface four industrial load cells, one placed under
each leg of the table. With this technology we can easily augment most tables in an
unobtrusive way. We implemented a perception algorithm on the microcontroller of
the DIY Smart-it to calculate events based on the weight changes on the table. Thus,
we are able to recognise when an artefact has been put on or removed from the table.
In a second step, we use the load distribution on the four load cells to calculate the
position of the artefact on the table surface. Additionally, we obtain the weight of the
artefact that has just been put on or removed from the table. Further details about the
implementation of context acquisition based on load sensing and its potential
applications have been published in [24] and [25].
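
The position calculation from the load distribution can be illustrated with a centre-of-pressure computation. The Python sketch below is only an illustration and not the implementation described in [24] and [25]: the cell coordinates, the function name `locate_load` and the use of reading differences before and after an event are assumptions made for this example.

```python
from typing import Sequence, Tuple

# Hypothetical positions (in cm) of the four load cells, one under each table leg.
CELLS: Sequence[Tuple[float, float]] = [
    (0.0, 0.0), (120.0, 0.0), (0.0, 80.0), (120.0, 80.0),
]


def locate_load(before: Sequence[float], after: Sequence[float]) -> Tuple[float, float, float]:
    """Return (x, y, weight) of the object that caused a load change.

    `before` and `after` are the four cell readings (in grams) around an event.
    The position is the centroid of the per-cell load differences; the same
    formula covers both adding and removing an object.
    """
    deltas = [a - b for a, b in zip(after, before)]
    total = sum(deltas)
    if total == 0:
        raise ValueError("no load change detected")
    x = sum(d * cx for d, (cx, _) in zip(deltas, CELLS)) / total
    y = sum(d * cy for d, (_, cy) in zip(deltas, CELLS)) / total
    return x, y, abs(total)
```

The sign of the total load change (positive or negative) would distinguish an "added" from a "removed" event in such a scheme.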



Fig. 4. Left: A mini Smart-its with battery attached to a water jug. The FSR sensor is mounted 
on the bottom of the jug. The glass was augmented in a similar way. Right: A DIY Smart-it 
with load add-on board attached to the frame of the table. The black cables connect to the four 
load cells. 

A smaller version of the DIY Smart-its has been used to augment jugs and glasses 
with force sensitive resistance sensors (FSR sensors). Thus the perception algorithm 
of glasses and jugs can observe if the artefact is put down on a surface or lifted up. 
Figure 4 shows the physical augmentation of the artefacts. 

Thus, each artefact is able to make individual observations about its world: 

• The load table observes the position and weight of artefacts on its top but does
not know their identity, i.e. whether an artefact is a jug, a glass or something else.

• Jugs and glasses observe whether they have been put on a surface or lifted up,
but they know nothing about their location.

3.2 Artefact Knowledge Bases

These observations are stored and managed in the knowledge bases embedded in the 
individual artefacts. They contain facts to represent the knowledge and rules to infer 
further knowledge. Each of the artefacts implements an inference engine similar to a 
simple Prolog interpreter that operates on these facts and rules expressed in a subset 
of Horn logic [14]. The inference engine uses backward chaining with depth-first
search as its inference algorithm. Compromises in terms of expressiveness and generality
were necessary to facilitate the implementation on a micro-controller platform. The
data structures for the predicate arguments indicate whether an argument
refers to an external artefact, which allows the inference engine to acquire knowledge
from other artefacts using a query/reply protocol.



Table 2. Knowledge base of a load table

Domain Knowledge         concurrent(<time>, <time>)

Observational Knowledge  location_and_weight(me, x, y, _, w, <added/removed>, <time>)

Rules

(R1)  location_and_weight(me, X, Y, A, W, added, T2) :-
          location_and_weight(me, X, Y, _, W, added, T1),
          location_and_weight(_, _, _, A, _, added, T2),
          concurrent(T1, T2).



Table 3. Knowledge base of a jug/glass

Domain Knowledge         concurrent(<time>, <time>)
                         above_weight_threshold(<weight>)
                         below_weight_threshold(<weight>)

Observational Knowledge  location_and_weight(_, _, _, me, _, <added/removed>, <time>)

Rules

(R2)  filling_state(me, full, T2) :-
          location_and_weight(TABLE, X, Y, _, W, EVENT1, T1),
          location_and_weight(_, _, _, me, _, EVENT2, T2),
          concurrent(T1, T2), above_weight_threshold(W), EVENT1 == EVENT2.

(R3)  filling_state(me, empty, T2) :-
          location_and_weight(TABLE, X, Y, _, W, EVENT1, T1),
          location_and_weight(_, _, _, me, _, EVENT2, T2),
          concurrent(T1, T2), below_weight_threshold(W), EVENT1 == EVENT2.



Rules and some facts are specified by the developer. Other facts, such as
location_and_weight(<table>, <x>, <y>, <artefact>, <weight>, <added/removed>, <time>),
model the observational knowledge acquired by the artefacts. This observation
indicates the position and weight of an artefact that has been added to or removed
from a table at a certain time. Both kinds of artefacts, the load table and
glasses/jugs, model their observations with the same fact, but with different levels
of information according to their perception capabilities as described in the previous
subsection. Table 2 lists the knowledge base of a load table, while Table 3 lists the
knowledge base of a jug or glass. Lowercase letters are constants and model a specific
value of an argument, while we use the special character _ to indicate an unknown or
irrelevant value of an argument. Uppercase letters indicate variables. The special
constant me always refers to the artefact that stores the fact or rule. Arguments in
angle brackets are to be replaced with the concrete values for each observation.
Rule R1 can be verbalised as follows:

R1: A table knows the location, weight and identity of an artefact on its top if both
the table and the artefact observe the event of putting the artefact on the table at
approximately the same time.

This rule relies on synchronised time between artefacts, as each observation is
time-stamped to determine time relationships between two observations.
concurrent(<time>, <time>) is semantically equivalent to the expression
|T1 - T2| < time_th with an appropriate threshold time_th. The table perception
algorithm for the location_and_weight observation takes a few milliseconds
longer than the algorithms of the glasses and jugs. Consequently we take T2, the time
of the glass (or jug) observation, as the timestamp for the inferred location
and weight. Rules R2 and R3 can be verbalised as follows:

R2 and R3: A glass or jug knows its filling state when it observes the same event at
approximately the same time as the table.

Rules R2 and R3 are a by-product of our initial goal to obtain location information
about artefacts that are put on the table. This is due to the fact that we measured the
weight distribution to calculate the position of artefacts. These rules also make use of
additional domain knowledge: above_weight_threshold(<weight>) and
below_weight_threshold(<weight>) model for each individual artefact a
weight threshold to determine if a glass or jug is full or empty.
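
To make the rules more concrete, the following Python sketch restates concurrent, R1 and R2/R3 as plain functions over observation records. It is an illustration only, not the embedded Prolog-like engine: the names `Obs`, `rule_r1` and `filling_state`, the values of `TIME_TH` and `WEIGHT_TH`, and the use of a single threshold for both above_weight_threshold and below_weight_threshold are assumptions.

```python
from typing import NamedTuple, Optional

TIME_TH = 5       # assumed value of time_th, in timestamp units
WEIGHT_TH = 200   # assumed per-artefact weight threshold, in grams


class Obs(NamedTuple):
    """One location_and_weight fact; None plays the role of the '_' placeholder."""
    table: Optional[str]
    x: Optional[int]
    y: Optional[int]
    artefact: Optional[str]
    weight: Optional[int]
    event: str
    time: int


def concurrent(t1: int, t2: int) -> bool:
    """concurrent(T1, T2) holds when |T1 - T2| < time_th."""
    return abs(t1 - t2) < TIME_TH


def rule_r1(table_obs: Obs, artefact_obs: Obs) -> Optional[Obs]:
    """R1: attach the artefact identity to a concurrent 'added' table observation."""
    if (table_obs.event == "added" and artefact_obs.event == "added"
            and concurrent(table_obs.time, artefact_obs.time)):
        # The artefact's timestamp T2 is used for the inferred fact.
        return table_obs._replace(artefact=artefact_obs.artefact,
                                  time=artefact_obs.time)
    return None


def filling_state(table_obs: Obs, artefact_obs: Obs) -> Optional[str]:
    """R2/R3: 'full' or 'empty' if both sides observed the same event concurrently."""
    if (table_obs.weight is not None
            and table_obs.event == artefact_obs.event
            and concurrent(table_obs.time, artefact_obs.time)):
        return "full" if table_obs.weight > WEIGHT_TH else "empty"
    return None
```

In the actual system the second observation is obtained from another artefact through the query/reply protocol rather than passed in as an argument.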



4 Tracking User Activities with the CA Demonstrator 

In this section we describe a sequence of actions as they occurred during the demo
setup. We use an initially empty load table, on which users place, and from which
they remove, a conventional bottle and the augmented glass. The
glass is initially empty and located on a conventional table. We perform the following
actions:

Action 1. Put a conventional bottle in the middle of the table 
Action 2. Put an empty, augmented glass on top left corner of the table 
Action 3. Remove glass from table and fill it with water 
Action 4. Put the glass back on the bottom right corner of the table 

Initially the artefact knowledge bases only contain their respective domain knowledge
entries (cf. Table 2 and Table 3). The knowledge base of the glass also reflects that it
has been put on the conventional table. After the conventional bottle has been put on
the load table, the perception component of the table adds the corresponding observation
to the knowledge base and we obtain the observations detailed in Table 4.

Table 4. Observations after action 1

Glass   location_and_weight(_, _, _, me, _, added, 30)

Table   location_and_weight(me, 25, 25, _, 400, added, 768)



In order to detect the changes in the environment and update the screen display, a
query for location_and_weight must be sent to the table. The query contains only
one value, the identifier of the table to which the query should be sent. All other
arguments are variables representing the values we are interested in:

location_and_weight(table, X, Y, A, W, EVENT, TIME)

When the table receives this message it tries to unify the message with the entries in
the knowledge base. The inference engine always looks for the most specific answer and
therefore tries to evaluate rule R1. It checks the premises of the rule and unifies the
table observation with the entry in the fact base. The external observation
location_and_weight(_, _, _, A, _, added, T2) requires cooperation
with an unknown artefact A. Therefore, the inference engine issues a broadcast query
for this observation. The glass replies with its observation as detailed in Table 3.
However, rule R1 cannot be applied as the timestamps are not close in time.
Therefore, the table replies with

location_and_weight(me, 25, 25, _, 400, added, 768)

and is not able to provide information about the actual identity of the bottle. The
visualization projected on the screen is now updated, showing a question mark for the
bottle.
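
The unification of a query with stored facts can be illustrated with a small pattern matcher. The Python sketch below is a simplified stand-in for the embedded inference engine, not its actual code: the function names, the convention that uppercase strings are variables, and the representation of facts as tuples with `me` already resolved to the table identifier are assumptions.

```python
from typing import Dict, Optional, Sequence


def is_var(term: object) -> bool:
    """Variables are written as strings starting with an uppercase letter."""
    return isinstance(term, str) and term[:1].isupper()


def match(query: Sequence[object], fact: Sequence[object]) -> Optional[Dict[str, object]]:
    """Return the variable bindings if `fact` answers `query`, else None."""
    if len(query) != len(fact):
        return None
    bindings: Dict[str, object] = {}
    for q, f in zip(query, fact):
        if is_var(q):
            # A variable that occurs twice must bind to the same value.
            if q in bindings and bindings[q] != f:
                return None
            bindings[q] = f
        elif q != f:
            # Constants must be identical.
            return None
    return bindings


# The query from the text, matched against the table observation after action 1.
query = ("table", "X", "Y", "A", "W", "EVENT", "TIME")
fact = ("table", 25, 25, "_", 400, "added", 768)
reply = match(query, fact)
```

Here `reply` binds A to the placeholder `_`, which corresponds to the table not knowing the identity of the bottle.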

After the second action, both the table and the glass add new observations to their 
knowledge bases as shown in Table 5. 



Table 5. Observations after action 2

Glass   location_and_weight(_, _, _, me, _, added, 2103)

Table   location_and_weight(me, 25, 25, _, 400, added, 768)
        location_and_weight(me, 25, 25, _, 131, added, 2105)



Note that the perception component of the glass always updates the knowledge base
to reflect the latest observed state, i.e. there was an intermediate observation when the
glass was lifted from the conventional table that is not shown in Table 5. Queries
to the table will now result in replies that provide information about the location of
the glass:

location_and_weight(table, 25, 25, glass, 400, added, 2105)

This is the result of applying rule R1, which entails a broadcast query. The glass
answers this query with the corresponding observation that was made at nearly the
same time. The identity of the glass can now be used to query the glass about its
filling state:

filling_state(glass, FILLING_STATE, T2)

In order to answer this query the glass must cooperate with the table: rules R2 and R3
rely on the table’s observations. Again, a broadcast query is issued, this time by the
glass, asking for the observation made by the table. The table replies with its
observations and the glass replies with the conclusion of rule R3:

filling_state(glass, empty, 2103)

Again the changes in the environment have been detected and the display can be 
updated: 




Fig. 5. Screenshot of the available knowledge after action 2. In the demonstrator, question
marks are used to represent unknown artefacts; their size is relative to their weight. Here
the question mark represents the bottle.

After the third action the knowledge bases of both artefacts are updated. In this state 
queries to the table always return the location of the unknown artefact (the bottle) as 
the observation stored in the glass relates to an earlier observation (cf. Table 6). 



Table 6. Observations after action 3

Glass   location_and_weight(_, _, _, me, _, removed, 2876)

Table   location_and_weight(me, 25, 25, _, 400, added, 1325)



As we fill the glass with water no new observations are added to either of the
knowledge bases. After action 4, cooperation between the artefacts proceeds as after
action 2 and we obtain the observations in Table 7:

Table 7. Observations after action 4

Glass   location_and_weight(_, _, _, me, _, added, 3469)

Table   location_and_weight(me, 25, 25, _, 400, added, 1325)
        location_and_weight(me, 120, 60, _, 400, added, 3472)



5 Related Work 

Our work is generally related to other ubiquitous computing research concerned with
instrumentation of the world and with systems that adapt and react to their
dynamically changing environment [10, 22, 1]. In contrast to our work, most of these
previously reported systems and infrastructures are based on instrumentation of
locations (e.g. office [1, 7, 20] or home [6, 15, 29]) or of users and their mobile
devices [19, 23, 27].

Wireless sensor networks are also generally related to our work as they use similar 
technology, i.e. wireless nodes with generic sensors [2]. However, they differ mainly 
in one point: functionalities implemented on wireless sensor nodes only include data 
acquisition and routing, i.e. taking measurements from sensors and sending the data to 
a backend infrastructure where it is processed and potential models of individual 
nodes are maintained. 

Previous research has also considered the role of artefacts in addition to locations 
and users, e.g. the Cooltown project provides a digital presence for people, places and 
things [16], and SPECs is a proximity sensing hardware platform for activity 
recognition [17]. Closer to our work are systems directly concerned with artefacts and
their situations, e.g. for tracking of moveable assets [18, 26]. Particularly close in
spirit is the eSeal system, in which artefacts are instrumented with embedded sensing and
perception to autonomously monitor their physical integrity [9].

Several levels of integration of artefacts in ubiquitous computing systems have 
been explored, e.g. visual tags [21] and RFID tags [18, 26] to support unique 
identification. SPECs have been attached to artefacts to capture movement
information of users [17]. Collective behaviour of augmented artefacts has been
explored in the context of the Smart-its project, e.g. by integrating different kinds of
sensors in users’ belongings [12], furniture [3], and cups [5]. A more generic
framework, based on event-condition-action rules (ECA rules), has been provided by 
the Ubiquitous Chip platform [30]. 



6 Conclusion 

In this paper we demonstrated the potential of the CA concept for activity recognition 
applications. We have described the design and implementation on an embedded 
systems platform in the context of a case study in which an augmented table and 
household goods can be used to track surface-based user activities. There are three 
innovative aspects to be noted: 

• It is a novel approach to acquire and maintain knowledge on activity and 
changes in the world, distinct in being entirely embedded in moveable artefacts. 

• Embedding of generic reasoning capabilities constitutes a new quality of 
embedded intelligence not previously demonstrated for otherwise non- 
computational artefacts. 

• This approach has the potential to leverage activity recognition applications by
providing rich knowledge about situations in the world that can be especially
useful for deployment in users’ homes where installing external infrastructure
might be critical.

Currently we are working on a software framework for embedded devices called
arteFACT that fully implements the CA architecture. Besides power efficiency and
responsiveness, our current prototype has also revealed the following two issues that we
seek to address in our future work:

• Activity recognition applications rely on timestamped data and history 
information. While we are currently extending our devices with FRAM and 
Flash memory to store history information, it will be crucial to include time 
as a fundamental concept. This will be especially important for querying 
history information. We are planning to look into possibilities of using 
temporal logic in our current implementation of the embedded inference 
engine. 

• In order to improve the scalability of our architecture, we plan to include
subscriptions to changes in the artefact knowledge bases. This will entail
including forward reasoning as a more effective inference algorithm.



Abstract. We discuss the issue of privacy protection in collaborative filtering,
focusing on the commonly-used memory-based approach. We show that the two
main steps in collaborative filtering, being the determination of similarities and
the prediction of ratings, can be performed on encrypted profiles, thereby securing
the users’ private data. We list a number of variants of the similarity measures and
prediction formulas described in literature, and show for each of them how they
can be computed using encrypted data only. Although we consider collaborative
filtering in this paper, the technique of comparing profiles using encrypted data
only is much more widely applicable.



1 Introduction 

Due to the increased availability of digital content and the explosive growth of the 
Internet, people are confronted with an overload of information. For instance, the amount 
of music available through the Internet is by far too large for a user to cope with. One 
of the approaches to help people to make good selections of content is given by the 
use of recommender systems. These systems estimate to what extent a user would like 
the available content, based on the user’s likes and dislikes for previously encountered 
content. A well-known technique to do so is collaborative filtering [7,13], also known as 
social filtering, which uses preferences of a community of users to predict the preference 
of a particular user for a particular piece of content. Collaborative filtering systems are 
found, for example, on the Internet at music sites to recommend new music, and at book 
sites to recommend new books. 

We can distinguish two global types of collaborative filtering approaches:
memory-based [7] and model-based [4]. Memory-based collaborative filtering is the most
commonly used approach. In this approach, which is a lazy learning approach in machine
learning terms, the preferences (in the form of ratings for content) of a community of 
users are collected at a web server. Then, a similarity measure is computed between 
each pair of users based on the content they jointly rated. Next, recommendations for a 
particular user can be made by considering users that are similar to him, and checking 
for content that they liked but that has not yet been rated by the user or that is not yet in 
the user’s collection. 

Model-based approaches pursue a more active learning strategy. First, the collected
preference data is processed to build a model of the users’ profile space. For instance,
Canny [4] describes a factor-analysis approach, which first distills a basis of user
preference profiles and expresses the individual users’ profiles in terms of this basis.
Next, this model is used to make predictions.

One of the issues with collaborative filtering is that it requires the preferences of a 
community of users for doing the computations. Generally, this information is collected 
on a server, and web services applying collaborative filtering are usually accompanied 
by a privacy statement. However, if the preference information is really personal (e.g. in 
case of medical information), users may not be willing to give this information, because 
of a lack of trust in the server. Furthermore, there may be other reasons why users do
not want to reveal preference information [3].

Therefore, we want to develop a system that prevents any information about a user’s
preferences from becoming known to others. This not only means that we want to keep the
user’s ratings for items secret, but even the information of what items he has rated. 
Furthermore, we do not even want to reveal this kind of information anonymously, as 
we do not want to run the risk that the identity of the user is traced back somehow, after 
which his data is in the clear. Finally, as similarities between users also give information 
about a user’s preferences, we also want to keep this data secret. 

In addition to the above requirements from a user’s perspective, we add the
requirement that the server should maintain some control over the service, i.e., it
should not be possible for a user to trivially retrieve valuable gathered data in order
to set up a recommendation service of his own.

Whereas Canny [4] focuses on model-based collaborative filtering, we discuss in this 
paper how the more commonly used memory-based collaborative filtering technique can 
be performed on encrypted data. This holds for all variants of similarity measures and 
prediction formulas that we describe. 

The remainder of this paper is organized as follows. First, in Section 2 we discuss the 
procedures and formulas behind memory-based collaborative filtering. Next, in Section 3 
we briefly describe the proposed encryption system and its beneficial properties. Then, 
we discuss how the above requirements can be met, by describing how to perform the 
collaborative filtering computations on encrypted data in Section 4. 

We focus in this paper on encryption of preference information in collaborative 
filtering, but the techniques we present are applicable in a much broader context, as 
many more ambient intelligence applications will use some form of matching profiles. 
Also these applications may be much better accepted by users if private information can 
be protected. We will however not elaborate on this. 

2 Memory-Based Collaborative Filtering 

Most memory-based collaborative filtering approaches work by first determining
similarities between users, by comparing their jointly rated items. Next, these similarities
are used to predict the rating of a user for a particular item, by interpolating between the 
ratings of the other users for this item. Typically, all computations are performed by the 
server, upon a user request for a recommendation. 

Next to the above approach, which is called a user-based approach and which is
the most widely used form of collaborative filtering [2,7,11], one can also follow an
item-based approach [8,12]. Then, first similarities are determined between items, by
comparing the ratings they have been given by the various users, and next the rating of
a user for an item is predicted by interpolating between the ratings that this user has
given for the other items. In the remainder of this paper, we focus only on the user-based
approach; the item-based approach can be handled in a similar way [14].

Before discussing the formulas underlying memory-based collaborative filtering, we
first introduce some notation. We assume a set $U$ of users and a set $I$ of items. Whether
a user $u \in U$ has rated item $i \in I$ is indicated by a boolean variable $b_{ui}$, which
equals one if the user has done so and zero otherwise. In the former case, also a rating
$r_{ui}$ is given, e.g. on a scale from 1 to 5. The set of users that have rated an item $i$
is denoted by $U_i$, and the set of items rated by a user $u$ is denoted by $I_u$.

2.1 Similarity Measures 

The first main step of memory-based collaborative filtering is the determination of
similarities. In this section, we discuss commonly used formulas, all of which we later
show can be computed on encrypted data. Quite a number of similarity measures have
been presented in the literature before. We distinguish three kinds: correlation measures, 
distance measures, and counting measures. 



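Correlation measures. A common similarity measure used in literature is the so-called
Pearson correlation coefficient (see e.g. [11]), given by

$$ s(u,v) = \frac{\sum_{i \in I_u \cap I_v} (r_{ui} - \bar{r}_u)(r_{vi} - \bar{r}_v)}
                 {\sqrt{\sum_{i \in I_u \cap I_v} (r_{ui} - \bar{r}_u)^2 \, \sum_{i \in I_u \cap I_v} (r_{vi} - \bar{r}_v)^2}}, \qquad (1) $$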



where $\bar{r}_u$ denotes the average rating of user $u$ for the items he has rated. The
numerator in this equation gets a positive contribution for each item that is either rated
above average by both users $u$ and $v$, or rated below average by both. If one user has
rated an item above average and the other user below average, we get a negative
contribution. The denominator in the equation normalizes the similarity, to fall in the
interval $[-1,1]$, where a value 1 indicates complete correspondence and $-1$ indicates
completely opposite tastes.

Related similarity measures are obtained by replacing $\bar{r}_u$ in (1) by the middle rating
(e.g. 3 if using a scale from 1 to 5) or by zero. In the latter case, the measure is called
vector similarity or cosine, and if all ratings are non-negative, the resulting similarity
value will then lie between 0 and 1.
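
As an illustration of the Pearson measure on plaintext ratings, the following Python sketch computes the correlation over the jointly rated items. The nested-dictionary representation of the ratings, the function names and the choice to return 0 when the denominator vanishes are assumptions of this example, not part of the paper.

```python
from math import sqrt
from typing import Dict

# ratings[u][i] holds r_ui; an item is present only if user u has rated it.
Ratings = Dict[str, Dict[str, float]]


def mean_rating(ratings: Ratings, u: str) -> float:
    """Average rating of user u over all items u has rated."""
    return sum(ratings[u].values()) / len(ratings[u])


def pearson(ratings: Ratings, u: str, v: str) -> float:
    """Pearson correlation of users u and v over their jointly rated items."""
    common = ratings[u].keys() & ratings[v].keys()
    mu, mv = mean_rating(ratings, u), mean_rating(ratings, v)
    num = sum((ratings[u][i] - mu) * (ratings[v][i] - mv) for i in common)
    den = sqrt(sum((ratings[u][i] - mu) ** 2 for i in common)
               * sum((ratings[v][i] - mv) ** 2 for i in common))
    # Assumption: report "no correlation" when the denominator is zero.
    return num / den if den else 0.0
```

Replacing `mu` and `mv` by zero in this sketch would yield the vector (cosine) similarity mentioned above.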



Distance measures. Another type of measures is given by distances between two users’
ratings, such as the mean-square difference [13] given by

$$ \frac{\sum_{i \in I_u \cap I_v} (r_{ui} - r_{vi})^2}{|I_u \cap I_v|}, \qquad (2) $$

or the normalized Manhattan distance [1] given by

$$ \frac{\sum_{i \in I_u \cap I_v} |r_{ui} - r_{vi}|}{|I_u \cap I_v|}. \qquad (3) $$

Such a distance is zero if the users rated their overlapping items identically, and larger
otherwise. A simple transformation converts a distance into a measure that is high if
users’ ratings are similar and low otherwise.
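
The two distance measures can be sketched in Python as follows. Each user's ratings are represented as a dictionary from item to rating, and both distances are normalized by the number of jointly rated items, as the names of the measures suggest; the function names and the error raised for users without common items are assumptions of this example.

```python
from typing import Dict


def _common_items(ru: Dict[str, float], rv: Dict[str, float]):
    """Items rated by both users; at least one is required."""
    common = ru.keys() & rv.keys()
    if not common:
        raise ValueError("users have no jointly rated items")
    return common


def mean_square_difference(ru: Dict[str, float], rv: Dict[str, float]) -> float:
    """Average squared rating difference over the jointly rated items."""
    common = _common_items(ru, rv)
    return sum((ru[i] - rv[i]) ** 2 for i in common) / len(common)


def normalized_manhattan(ru: Dict[str, float], rv: Dict[str, float]) -> float:
    """Average absolute rating difference over the jointly rated items."""
    common = _common_items(ru, rv)
    return sum(abs(ru[i] - rv[i]) for i in common) / len(common)
```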



Counting measures. Counting measures are based on counting the number of items that
two users rated (nearly) identically. A simple counting measure is the majority voting
measure [9] of the form

$$ s(u,v) = (2 - \gamma)^{c_{uv}} \, \gamma^{d_{uv}}, \qquad (4) $$

where $\gamma$ is chosen between 0 and 1, $c_{uv} = |\{ i \in I_u \cap I_v \mid r_{ui} \approx r_{vi} \}|$
gives the number of items rated ‘the same’ by $u$ and $v$, and
$d_{uv} = |I_u \cap I_v| - c_{uv}$ gives the number of items rated ‘differently’. The
relation $\approx$ may here be defined as exact equality, but also
nearly-matching ratings may be considered sufficiently equal.
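
A minimal Python sketch of the majority voting measure is given below. It assumes the form (2 - γ)^c γ^d, with c the number of jointly rated items rated 'the same' and d the number rated 'differently'; the tolerance parameter `tol`, which models nearly-matching ratings, and the default value of `gamma` are hypothetical.

```python
from typing import Dict


def majority_voting(ru: Dict[str, float], rv: Dict[str, float],
                    gamma: float = 0.5, tol: float = 0.0) -> float:
    """Majority voting similarity of two users' rating dictionaries.

    Two ratings count as 'the same' when they differ by at most `tol`
    (tol = 0 means exact equality).
    """
    common = ru.keys() & rv.keys()
    c = sum(1 for i in common if abs(ru[i] - rv[i]) <= tol)  # rated the same
    d = len(common) - c                                      # rated differently
    return (2 - gamma) ** c * gamma ** d
```

Each agreement multiplies the similarity by a factor larger than one and each disagreement by a factor smaller than one, so the measure grows with the number of matching ratings.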

Another counting measure is given by the weighted kappa statistic [5], which is
defined as the ratio between the observed agreement between two users and the maximum
possible agreement, where both are corrected for agreement by chance. More formally,
the measure is given by

$$ s(u,v) = \frac{o_{uv} - e_{uv}}{1 - e_{uv}}. $$

Here, $o_{uv}$ is the observed fraction of agreement, given by

$$ o_{uv} = \frac{\sum_{i \in I_u \cap I_v} w(r_{ui}, r_{vi})}{|I_u \cap I_v|}, $$

where weights $w(x,y)$, with $0 \le w(x,y) = w(y,x) \le 1$ and $w(x,x) = 1$, indicate the
degree of correspondence between ratings $x$ and $y$. The offset $e_{uv}$ is the expected
fraction of agreement, and is given by

$$ e_{uv} = \sum_{x \in X} \sum_{y \in X} p_u(x) \, p_v(y) \, w(x,y), $$

where $X$ is the set of possible ratings, and $p_u(x)$ is the fraction of items that $u$ has
given a rating $x$, i.e.,

$$ p_u(x) = \frac{|\{ i \in I_u \mid r_{ui} = x \}|}{|I_u|}. $$
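
The weighted kappa statistic can be sketched in Python as follows, assuming the usual chance-corrected form (o - e) / (1 - e) with o the observed and e the expected fraction of agreement. The function names, the default exact-match weight function and the error raised when e equals 1 are assumptions of this example.

```python
from typing import Callable, Dict, Sequence


def exact_match(x: float, y: float) -> float:
    """Simplest weight function: 1 for identical ratings, 0 otherwise."""
    return 1.0 if x == y else 0.0


def weighted_kappa(ru: Dict[str, float], rv: Dict[str, float],
                   scale: Sequence[float],
                   w: Callable[[float, float], float] = exact_match) -> float:
    """Chance-corrected agreement between two users' rating dictionaries."""
    common = ru.keys() & rv.keys()
    if not common:
        raise ValueError("users have no jointly rated items")
    # Observed fraction of agreement over the jointly rated items.
    o = sum(w(ru[i], rv[i]) for i in common) / len(common)
    # Fraction of each user's items that received rating x.
    pu = {x: sum(1 for r in ru.values() if r == x) / len(ru) for x in scale}
    pv = {x: sum(1 for r in rv.values() if r == x) / len(rv) for x in scale}
    # Expected fraction of agreement by chance.
    e = sum(pu[x] * pv[y] * w(x, y) for x in scale for y in scale)
    if e == 1:
        raise ValueError("agreement by chance is certain; kappa is undefined")
    return (o - e) / (1 - e)
```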



2.2 Prediction Formulas 

The second step in collaborative filtering is to use the similarities to compute a prediction 
for a certain user-item pair. Also for this step several variants exist. For all formulas, we 
assume that there are users that have rated the given item; otherwise no prediction can 
be made. 



Weighted sums. The first prediction formula, as used in [7], is given by

$$ \hat{r}_{ui} = \bar{r}_u + \frac{\sum_{v \in U_i} s(u,v) \, (r_{vi} - \bar{r}_v)}{\sum_{v \in U_i} |s(u,v)|}. \qquad (5) $$

So, the prediction is the average rating of user $u$ plus a weighted sum of deviations from
the averages. In this sum, all users are considered that have rated item $i$. Alternatively,
one may restrict them to users that also have a sufficiently high similarity to user $u$, i.e.,
we sum over all users in $U_i(t) = \{ v \in U_i \mid s(u,v) > t \}$ for some threshold $t$.
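
The weighted-sum prediction can be sketched on plaintext ratings as follows. The nested-dictionary representation of the ratings, the function names, and passing the similarity measure as a callable `sim` are assumptions of this example; any of the similarity measures of Section 2.1 could be plugged in.

```python
from typing import Callable, Dict, Optional

Ratings = Dict[str, Dict[str, float]]  # ratings[u][i] holds r_ui


def mean_rating(ratings: Ratings, u: str) -> float:
    """Average rating of user u over all items u has rated."""
    return sum(ratings[u].values()) / len(ratings[u])


def predict_weighted_sum(ratings: Ratings, u: str, item: str,
                         sim: Callable[[str, str], float],
                         threshold: Optional[float] = None) -> float:
    """Average of u plus the similarity-weighted deviations of the other raters."""
    raters = [v for v in ratings if v != u and item in ratings[v]]
    if threshold is not None:
        # Restrict to sufficiently similar users.
        raters = [v for v in raters if sim(u, v) > threshold]
    den = sum(abs(sim(u, v)) for v in raters)
    if den == 0:
        raise ValueError("no usable raters for this item; no prediction possible")
    num = sum(sim(u, v) * (ratings[v][item] - mean_rating(ratings, v))
              for v in raters)
    return mean_rating(ratings, u) + num / den
```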

An alternative, somewhat simpler prediction formula is given by

$$ \hat{r}_{ui} = \frac{\sum_{v \in U_i} s(u,v) \, r_{vi}}{\sum_{v \in U_i} |s(u,v)|}. \qquad (6) $$

Note that if all ratings are positive, then this formula only makes sense if all similarity 
values are non-negative, which may be realized by choosing a non-negative threshold. 
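
The simpler formula amounts to a similarity-weighted average of the other users' ratings. The sketch below assumes a nested-dictionary rating representation, a similarity callable `sim`, and a default threshold of zero so that only positive similarities enter the average; all names are hypothetical.

```python
from typing import Callable, Dict

Ratings = Dict[str, Dict[str, float]]  # ratings[u][i] holds r_ui


def predict_weighted_average(ratings: Ratings, u: str, item: str,
                             sim: Callable[[str, str], float],
                             threshold: float = 0.0) -> float:
    """Similarity-weighted average rating of the users who rated `item`."""
    raters = [v for v in ratings
              if v != u and item in ratings[v] and sim(u, v) > threshold]
    den = sum(abs(sim(u, v)) for v in raters)
    if den == 0:
        raise ValueError("no usable raters for this item; no prediction possible")
    return sum(sim(u, v) * ratings[v][item] for v in raters) / den
```

With positive ratings and a non-negative threshold, the result always lies between the lowest and the highest rating given to the item.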



Maximum total similarity. A second type of prediction formula is given by choosing
the rating that maximizes a kind of total similarity, as is done in the majority voting
approach, given by

$$ \hat{r}_{ui} = \arg\max_{x \in X} \sum_{v \in U_i^x} s(u,v), \qquad (7) $$

where $U_i^x = \{ v \in U_i \mid r_{vi} \approx x \}$ is the set of users that gave item $i$ a
rating similar to value $x$. Again, the relation $\approx$ may be defined as exact equality,
but also nearly-matching ratings may be allowed. Also in this formula one may use $U_i(t)$
instead of $U_i$ to restrict oneself to sufficiently similar users.
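
The maximum-total-similarity prediction can be sketched as an arg max over the rating scale. The rating representation, the similarity callable `sim`, the tolerance `tol` used to model nearly-matching ratings, and the tie-breaking rule (the first maximal value in `scale` wins) are assumptions of this example.

```python
from typing import Callable, Dict, Sequence

Ratings = Dict[str, Dict[str, float]]  # ratings[u][i] holds r_ui


def predict_max_total_similarity(ratings: Ratings, u: str, item: str,
                                 sim: Callable[[str, str], float],
                                 scale: Sequence[float],
                                 tol: float = 0.0) -> float:
    """Return the rating x whose supporters have the largest total similarity to u."""
    raters = [v for v in ratings if v != u and item in ratings[v]]
    if not raters:
        raise ValueError("nobody has rated this item; no prediction possible")

    def total_similarity(x: float) -> float:
        # Sum the similarities of all users whose rating is (nearly) x.
        return sum(sim(u, v) for v in raters if abs(ratings[v][item] - x) <= tol)

    return max(scale, key=total_similarity)
```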



3 Encryption 

In the next section we show how the presented formulas for collaborative filtering can be 
computed on encrypted ratings. Before doing so, we present the encryption system we 
use, and the specific properties it possesses that allow for the computation on encrypted 
data. 



3.1 A Public-Key Cryptosystem 

The cryptosystem we use is the public-key cryptosystem presented by Paillier [10]. We 
will not describe it in full detail, for which we refer to the paper, but we briefly describe 
how data is encrypted. 

First, encryption keys are generated. To this end, two large primes p and q are chosen 
randomly, and we compute n = pq and λ = lcm(p − 1, q − 1). Furthermore, a generator 
g is computed from p and q (for details, see [10]). Now, the pair (n, g) forms the public 
key of the cryptosystem, which is sent to everyone, and λ forms the private key, to be 
used for decryption, which is kept secret. 

Next, a sender who wants to send a message $m \in \mathbb{Z}_n = \{0, 1, \ldots, n-1\}$ to a receiver 
with public key (n, g) computes a ciphertext ε(m) by 

$$\varepsilon(m) = g^m r^n \bmod n^2, \tag{8}$$

where r is a number randomly drawn from $\mathbb{Z}_n^* = \{x \in \mathbb{Z} \mid 0 < x < n \wedge \gcd(x, n) = 1\}$. 
This r prevents decryption by simply encrypting all possible values of m (in case it can 
only assume a few values) and comparing the end result. The Paillier system is hence 
called a randomized encryption system. 

Decryption of a ciphertext c = ε(m) is done by computing 

$$m = \frac{L(c^{\lambda} \bmod n^2)}{L(g^{\lambda} \bmod n^2)} \bmod n,$$

where L(x) = (x − 1)/n for any 0 < x < n² with x ≡ 1 (mod n). During decryption, 
the random number r cancels out. 

Note that in the above cryptosystem the messages m are integers. Nevertheless, 
rational values are possible by multiplying them by a sufficiently large number and 
rounding off [6]. For instance, if we want to use messages with two decimals, we simply 
multiply them by 100 and round off. Usually, the range $\mathbb{Z}_n$ is large enough to allow for 
this multiplication. 
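
To make the key generation, encryption (8), decryption, and the two-decimal encoding concrete, here is a toy sketch. It assumes the generator g = n + 1, which is one standard valid choice (the text defers the choice of g to [10]), and it uses tiny hard-coded primes that are far too small for real use. The function names are illustrative only.

```python
import math
import secrets

def keygen(p, q):
    """Return the public key (n, g) and the private key lam = lcm(p-1, q-1)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)
    return (n, n + 1), lam              # g = n + 1 is an assumption of this sketch

def encrypt(pub, m):
    n, g = pub
    while True:                         # draw r from Z_n^*
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n * n) * pow(r, n, n * n)) % (n * n)

def L(x, n):
    return (x - 1) // n

def decrypt(pub, lam, c):
    n, g = pub
    num = L(pow(c, lam, n * n), n)
    den = L(pow(g, lam, n * n), n)
    return (num * pow(den, -1, n)) % n  # division modulo n (Python 3.8+)

pub, lam = keygen(1009, 1013)           # toy primes, for illustration only
c1, c2 = encrypt(pub, 42), encrypt(pub, 42)
print(decrypt(pub, lam, c1), decrypt(pub, lam, c2))   # both decrypt to 42

# Two-decimal rational value: multiply by 100 and round before encrypting.
encoded = round(3.75 * 100)
print(decrypt(pub, lam, encrypt(pub, encoded)) / 100)
```

Because r is drawn afresh on every call, `c1` and `c2` will in general be different ciphertexts of the same message.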



3.2 Properties 

The encryption scheme presented above has the following nice properties. The first one 
is that 

$$\varepsilon(m_1)\,\varepsilon(m_2) \equiv g^{m_1} r_1^n\, g^{m_2} r_2^n \equiv g^{m_1+m_2} (r_1 r_2)^n \equiv \varepsilon(m_1 + m_2) \pmod{n^2},$$

which allows us to compute sums on encrypted data. Secondly, 

$$\varepsilon(m_1)^{m_2} \equiv (g^{m_1} r_1^n)^{m_2} \equiv g^{m_1 m_2} (r_1^{m_2})^n \equiv \varepsilon(m_1 m_2) \pmod{n^2},$$

which allows us to compute products on encrypted data. An encryption scheme with 
these two properties is called a homomorphic encryption scheme. The Paillier system is 
one homomorphic encryption scheme, but others exist. 

We can use the above properties to calculate sums of products, as required for the 
similarity measures and predictions, using 

$$\prod_j \varepsilon(a_j)^{b_j} \equiv \prod_j \varepsilon(a_j b_j) \equiv \varepsilon\Big(\sum_j a_j b_j\Big) \pmod{n^2}. \tag{9}$$

So, using this, two users a and b can compute an inner product between a vector of each 
of them in the following way. User a first encrypts his entries $a_j$ and sends them to b. 
User b then computes (9), as given by the left-hand term, and sends the result back to a. 
User a next decrypts the result to get the desired inner product. Note that neither user a 
nor user b can observe the data of the other user; the only thing user a gets to know is 
the inner product. 
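
The inner-product exchange between users a and b can be sketched as follows. The Paillier helpers repeat the toy construction with g = n + 1 and tiny primes (both assumptions of this sketch, not choices made in the text), and `encrypted_inner_product` is a hypothetical name for user b's step.

```python
import math
import secrets

def keygen(p, q):
    n = p * q
    return (n, n + 1), (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)

def encrypt(pub, m):
    n, g = pub
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n * n) * pow(r, n, n * n)) % (n * n)

def decrypt(pub, lam, c):
    n, g = pub
    L = lambda x: (x - 1) // n
    return (L(pow(c, lam, n * n)) * pow(L(pow(g, lam, n * n)), -1, n)) % n

def encrypted_inner_product(pub, encrypted_a, b):
    """User b's step: the left-hand side of (9), computed on ciphertexts only."""
    n2 = pub[0] ** 2
    result = 1
    for c_j, b_j in zip(encrypted_a, b):
        result = (result * pow(c_j, b_j, n2)) % n2
    return result

# User a owns the key pair and the vector a.
pub, lam = keygen(1009, 1013)
a = [3, 0, 5, 2]
encrypted_a = [encrypt(pub, a_j) for a_j in a]     # sent to user b

# User b owns the vector b and never sees the entries of a in the clear.
b = [1, 4, 2, 0]
reply = encrypted_inner_product(pub, encrypted_a, b)  # sent back to user a

print(decrypt(pub, lam, reply))  # 3*1 + 0*4 + 5*2 + 2*0 = 13
```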

A final property we want to mention is that 

$$\varepsilon(m_1)\,\varepsilon(0) \equiv g^{m_1} r_1^n\, g^{0} r_2^n \equiv g^{m_1} (r_1 r_2)^n \equiv \varepsilon(m_1) \pmod{n^2}.$$

This action, which is called (re)blinding, can also be used to avoid a trial-and-error 
attack as discussed above, by means of the random number $r_2 \in \mathbb{Z}_n^*$. We will use this in 
Section 4.2. 



4 Encrypted Collaborative Filtering 

Having all ingredients in place, we now explain how memory-based collaborative filtering 
can be performed on encrypted data, in order to compute a prediction $\hat{r}_{ui}$ for a certain 
user u and item i. Note that although the computations are done on encrypted data, the 
outcome is of course identical to that of the original collaborative filtering algorithm. 

We consider a setup as depicted in Figure 1, where user u communicates with other 
users v through a server. Furthermore, each user has generated his own key, and has 
published the public part of it. As we want to compute a prediction for user u, the steps 
below will use the keys of u. 




Fig. 1. The setup for the user-based algorithm. 



4.1 Computing Similarities on Encrypted Data 

First we take the similarity computation step, for which we start with the Pearson corre- 
lation given in (1). Although we already explained in Section 3 how to compute an inner 
product on encrypted data, we have to resolve the problem that the iterator i in the sums 
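in (1) only runs over $I_u \cap I_v$, and this intersection is not known to either user. Therefore, 
we first introduce 

$$q_{ui} = \begin{cases} r_{ui} - \bar{r}_u & \text{if } b_{ui} = 1, \text{ i.e., user } u \text{ rated item } i, \\ 0 & \text{otherwise,} \end{cases}$$

and rewrite (1) into 

$$s(u,v) = \frac{\sum_{i \in I} q_{ui}\, q_{vi}}{\sqrt{\sum_{i \in I} q_{ui}^2\, b_{vi} \; \sum_{i \in I} q_{vi}^2\, b_{ui}}}.$$

The idea that we used is that any $i \notin I_u \cap I_v$ does not contribute to any of the three sums 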
because at least one of the factors in the corresponding term will be zero. Hence, we 
have rewritten the similarity into a form consisting of three inner products, each between 
a vector of u and one of v. 
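
The rewriting can be checked on plaintext data: the three inner products over the full item set I give the same value as the Pearson correlation evaluated over the intersection. The sketch below assumes that each user's mean is taken over all items that user rated, which is what the definition of $q_{ui}$ requires; the names and data are hypothetical.

```python
import math

def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))

def vectors(r, items):
    """q_i = r_i - mean (0 if unrated), q_i squared, and the indicator b_i, over all items."""
    avg = sum(r.values()) / len(r)
    q = [r[i] - avg if i in r else 0.0 for i in items]
    b = [1 if i in r else 0 for i in items]
    return q, [x * x for x in q], b

def pearson_inner(ru, rv, items):
    """Similarity from three inner products, each between a vector of u and one of v."""
    qu, qu2, bu = vectors(ru, items)
    qv, qv2, bv = vectors(rv, items)
    return dot(qu, qv) / math.sqrt(dot(qu2, bv) * dot(qv2, bu))

def pearson_direct(ru, rv):
    """Reference: sums running only over the commonly rated items."""
    common = ru.keys() & rv.keys()
    mu, mv = sum(ru.values()) / len(ru), sum(rv.values()) / len(rv)
    num = sum((ru[i] - mu) * (rv[i] - mv) for i in common)
    den = math.sqrt(sum((ru[i] - mu) ** 2 for i in common)
                    * sum((rv[i] - mv) ** 2 for i in common))
    return num / den

items = ["a", "b", "c", "d"]
ru = {"a": 5, "b": 3, "c": 4}
rv = {"b": 2, "c": 5, "d": 1}
print(pearson_inner(ru, rv, items), pearson_direct(ru, rv))
```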

The protocol now runs as follows. First, user u calculates encrypted entries $\varepsilon(q_{ui})$, 
$\varepsilon(q_{ui}^2)$, and $\varepsilon(b_{ui})$ for all $i \in I$, using (8), and sends them to the server. The server forwards 
these encrypted entries to each other user $v_1, \ldots, v_m$. Next, each user $v_j$, $j = 1, \ldots, m$, 
computes $\varepsilon(\sum_{i \in I} q_{ui}\, q_{v_j i})$, $\varepsilon(\sum_{i \in I} q_{ui}^2\, b_{v_j i})$, and $\varepsilon(\sum_{i \in I} q_{v_j i}^2\, b_{ui})$, using (9), and sends 
these three results back to the server, which forwards them to user u. User u can decrypt 
the total of 3m results and compute the similarities $s(u, v_j)$, for all $j = 1, \ldots, m$. Note 
that user u now knows similarity values with the other m users, but he need not know 
who each user $j = 1, \ldots, m$ is. The server, on the other hand, knows who each user 
$j = 1, \ldots, m$ is, but it does not know the similarity values. 

For the other similarity measures, we can also derive computation schemes using 
encrypted data only. For the mean-square distance, we can rewrite (2) into 

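$$\frac{\sum_{i \in I_u \cap I_v} (r_{ui}^2 - 2 r_{ui} r_{vi} + r_{vi}^2)}{|I_u \cap I_v|} = \frac{\sum_{i \in I} r_{ui}^2\, b_{vi} + 2 \sum_{i \in I} r_{ui} (-r_{vi}) + \sum_{i \in I} r_{vi}^2\, b_{ui}}{\sum_{i \in I} b_{ui}\, b_{vi}},$$

where we additionally define $r_{ui} = 0$ if $b_{ui} = 0$ in order to have well-defined values. So, 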
this distance measure can also be computed by means of four inner products. 
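
A plaintext check of the four inner products (three in the numerator, one in the denominator) is sketched below; the names and data are hypothetical.

```python
def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))

def msd_inner(ru, rv, items):
    """Mean-square distance from four inner products over the full item set."""
    r_u = [ru.get(i, 0) for i in items]        # r_ui = 0 when b_ui = 0
    r_v = [rv.get(i, 0) for i in items]
    b_u = [1 if i in ru else 0 for i in items]
    b_v = [1 if i in rv else 0 for i in items]
    numerator = (dot([x * x for x in r_u], b_v)
                 + 2 * dot(r_u, [-y for y in r_v])
                 + dot([y * y for y in r_v], b_u))
    return numerator / dot(b_u, b_v)

def msd_direct(ru, rv):
    """Reference: average squared difference over the commonly rated items."""
    common = ru.keys() & rv.keys()
    return sum((ru[i] - rv[i]) ** 2 for i in common) / len(common)

items = ["a", "b", "c", "d"]
ru = {"a": 5, "b": 3, "c": 4}
rv = {"b": 2, "c": 5, "d": 1}
print(msd_inner(ru, rv, items), msd_direct(ru, rv))  # (25 - 52 + 29) / 2 on both sides
```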

The computation of normalized Manhattan distances is somewhat more complicated. 
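Given the set X of possible ratings, we first define for each $x \in X$, 

$$b_{ui}^x = \begin{cases} 1 & \text{if } b_{ui} = 1 \wedge r_{ui} = x, \\ 0 & \text{otherwise,} \end{cases}$$

and 

$$a_{ui}^x = \begin{cases} |r_{ui} - x| & \text{if } b_{ui} = 1, \\ 0 & \text{otherwise.} \end{cases}$$

Now, (3) can be rewritten into 

$$\frac{\sum_{i \in I} \sum_{x \in X} b_{ui}^x\, a_{vi}^x}{\sum_{i \in I} b_{ui}\, b_{vi}} = \frac{\sum_{x \in X} \sum_{i \in I} b_{ui}^x\, a_{vi}^x}{\sum_{i \in I} b_{ui}\, b_{vi}}.$$

So, the normalized Manhattan distance can be computed from |X| + 1 inner products. 
Furthermore, for the numerator a user v can compute 
$\prod_{x \in X} \varepsilon(\sum_{i \in I} b_{ui}^x\, a_{vi}^x) \equiv \varepsilon(\sum_{x \in X} \sum_{i \in I} b_{ui}^x\, a_{vi}^x)$, 
and send this result, together with the encrypted denominator, 
back to user u. 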
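
The decomposition into |X| + 1 inner products can again be verified on plaintext data. In the sketch below the rating scale 1 to 5 and all names are assumptions for illustration.

```python
def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))

def manhattan_inner(ru, rv, items, scale):
    """Normalized Manhattan distance from |X| + 1 inner products."""
    numerator = 0
    for x in scale:                                     # one inner product per rating x
        b_x_u = [1 if ru.get(i) == x else 0 for i in items]          # u rated i exactly x
        a_x_v = [abs(rv[i] - x) if i in rv else 0 for i in items]    # distance of v's rating to x
        numerator += dot(b_x_u, a_x_v)
    b_u = [1 if i in ru else 0 for i in items]
    b_v = [1 if i in rv else 0 for i in items]
    return numerator / dot(b_u, b_v)                    # the extra inner product

def manhattan_direct(ru, rv):
    common = ru.keys() & rv.keys()
    return sum(abs(ru[i] - rv[i]) for i in common) / len(common)

items = ["a", "b", "c", "d"]
ru = {"a": 5, "b": 3, "c": 4}
rv = {"b": 2, "c": 5, "d": 1}
print(manhattan_inner(ru, rv, items, range(1, 6)), manhattan_direct(ru, rv))
```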

The majority-voting measure can also be computed in the above way, by defining 

$$a_{ui}^x = \begin{cases} 1 & \text{if } b_{ui} = 1 \wedge r_{ui} \approx x, \\ 0 & \text{otherwise.} \end{cases} \tag{10}$$

Then, $c_{uv}$ used in (4) is given by 

$$c_{uv} = \sum_{x \in X} \sum_{i \in I} b_{ui}^x\, a_{vi}^x,$$

which can again be computed in a way as described above. Furthermore, 

$$d_{uv} = \sum_{i \in I} b_{ui}\, b_{vi} - c_{uv}.$$

Finally, we consider the weighted kappa measure. Again, $o_{uv}$ can be computed by defining 

$$a_{ui}^x = \begin{cases} w(x, r_{ui}) & \text{if } b_{ui} = 1, \\ 0 & \text{otherwise,} \end{cases}$$

and then calculating 

$$o_{uv} = \frac{\sum_{x \in X} \sum_{i \in I} b_{ui}^x\, a_{vi}^x}{\sum_{i \in I} b_{ui}\, b_{vi}}.$$

Furthermore, $e_{uv}$ can be computed in an encrypted way if user u encrypts $p_u(x)$ for all 
$x \in X$ and sends them to each other user v, who can then compute 

$$\prod_{x \in X} \prod_{y \in X} \varepsilon(p_u(x))^{p_v(y)\, w(x,y)}$$

and send this back to u for decryption. 


4.2 Computing Predictions on Encrypted Data 

For the second step of collaborative filtering, user u can calculate a prediction for item 
i in the following way. First, we rewrite the quotient in (5) into 

$$\frac{\sum_{v \in U} s(u,v)\, q_{vi}}{\sum_{v \in U} |s(u,v)|\, b_{vi}}.$$

So, first user u encrypts $s(u,v_j)$ and $|s(u,v_j)|$ for each other user $v_j$, $j = 1, \ldots, m$, and 
sends them to the server. The server then forwards each pair $\varepsilon(s(u,v_j))$, $\varepsilon(|s(u,v_j)|)$ 
to the respective user $v_j$, who computes $\varepsilon(s(u,v_j))^{q_{v_j i}}\, \varepsilon(0) \equiv \varepsilon(s(u,v_j)\, q_{v_j i})$ and 
$\varepsilon(|s(u,v_j)|)^{b_{v_j i}}\, \varepsilon(0) \equiv \varepsilon(|s(u,v_j)|\, b_{v_j i})$, where he uses reblinding to prevent the server 
from getting knowledge from the data going back and forth to user $v_j$ by trying a few 
possible values. Each user $v_j$ next sends the results back to the server, which then 
computes 

$$\prod_{j=1}^{m} \varepsilon(s(u,v_j)\, q_{v_j i}) \equiv \varepsilon\Big(\sum_{j=1}^{m} s(u,v_j)\, q_{v_j i}\Big)$$

and 

$$\prod_{j=1}^{m} \varepsilon(|s(u,v_j)|\, b_{v_j i}) \equiv \varepsilon\Big(\sum_{j=1}^{m} |s(u,v_j)|\, b_{v_j i}\Big),$$

and sends these two results back to user u. User u can then decrypt these messages and 
use them to compute the prediction. The simple prediction formula of (6) can be handled 
in a similar way. 
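
The three roles in this protocol (user u, the users $v_j$, and the server) can be sketched end to end as follows. Several details are assumptions of this sketch rather than statements of the text: the toy Paillier construction with g = n + 1 and tiny primes, the fixed-point factor 100, the representation of negative values in the upper half of $\mathbb{Z}_n$, and all function names.

```python
import math
import secrets

SCALE = 100  # fixed-point factor (two decimals)

def keygen(p, q):
    n = p * q
    return (n, n + 1), (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)

def encrypt(pub, m):
    n, g = pub
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m % n, n * n) * pow(r, n, n * n)) % (n * n)

def decrypt_signed(pub, lam, c):
    n, g = pub
    L = lambda x: (x - 1) // n
    m = (L(pow(c, lam, n * n)) * pow(L(pow(g, lam, n * n)), -1, n)) % n
    return m - n if m > n // 2 else m      # upper half of Z_n encodes negatives

def user_step(pub, enc_s, enc_abs_s, q_vi, b_vi):
    """User v_j: exponentiate with private data, then reblind with an encryption of 0."""
    n = pub[0]
    c1 = (pow(enc_s, q_vi % n, n * n) * encrypt(pub, 0)) % (n * n)
    c2 = (pow(enc_abs_s, b_vi, n * n) * encrypt(pub, 0)) % (n * n)
    return c1, c2

def server_step(pub, pairs):
    """Server: multiply the ciphertexts, which adds the underlying plaintexts."""
    n2 = pub[0] ** 2
    num = den = 1
    for c1, c2 in pairs:
        num, den = (num * c1) % n2, (den * c2) % n2
    return num, den

pub, lam = keygen(1009, 1013)               # toy primes, for illustration only
sims = [0.80, -0.50, 0.30]                  # s(u, v_j), known to u only
private = [(150, 1), (-50, 1), (0, 0)]      # (q_{v_j i} * SCALE, b_{v_j i}), known to v_j only

sent = [(encrypt(pub, round(s * SCALE)), encrypt(pub, round(abs(s) * SCALE)))
        for s in sims]
pairs = [user_step(pub, es, ea, q, b) for (es, ea), (q, b) in zip(sent, private)]
enc_num, enc_den = server_step(pub, pairs)

num = decrypt_signed(pub, lam, enc_num) / SCALE ** 2   # undo both scalings
den = decrypt_signed(pub, lam, enc_den) / SCALE
r_bar_u = 3.0                                          # u's own average rating
print(r_bar_u + num / den)                             # 3 + 1.45 / 1.3
```

The third user has not rated the item, so both of its contributions are encryptions of zero that the server cannot tell apart from any other ciphertext.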

The maximum total similarity prediction as given by (7) can be handled as follows. 
First, we rewrite 

$$\sum_{v \in U_i^x} s(u,v) = \sum_{j=1}^{m} s(u,v_j)\, a_{v_j i}^x,$$

where $a_{v_j i}^x$ is as defined by (10). Next, user u encrypts $s(u,v_j)$ for each other user $v_j$, 
$j = 1, \ldots, m$, and sends them to the server. The server then forwards each $\varepsilon(s(u,v_j))$ 
to the respective user $v_j$, who computes $\varepsilon(s(u,v_j))^{a_{v_j i}^x}\, \varepsilon(0) \equiv \varepsilon(s(u,v_j)\, a_{v_j i}^x)$, for each 
rating $x \in X$, using reblinding. Next, each user $v_j$ sends these |X| results back to the 
server, which then computes 

$$\prod_{j=1}^{m} \varepsilon(s(u,v_j)\, a_{v_j i}^x) \equiv \varepsilon\Big(\sum_{j=1}^{m} s(u,v_j)\, a_{v_j i}^x\Big)$$

for each $x \in X$, and sends the |X| results to user u. Finally, user u decrypts these results 
and determines the rating x that has the highest result. 



5 Conclusion 

We have shown how collaborative filtering can be done on encrypted data. In this way, 
sensitive information about a user’s preferences, as discussed in the introduction, is kept 
secret, as it only leaves the user’s system in an encrypted form. We have listed a number 
of variants of the similarity measures and prediction formulas described in literature, and 
showed for each of them how they can be computed using encrypted data only, without 
affecting their results. 

Compared to the original set-up of collaborative filtering, the new set-up requires 
a more active role of the users’ devices. This means that instead of a (single) server 
that runs an algorithm, we now have a system running a distributed algorithm, where 
all the nodes are actively involved in parts of the algorithm. The time complexity of 
the algorithm basically stays the same, except for an additional factor |X| for some 
similarity measures and prediction formulas, and the fact that the new set-up allows for 
parallel computations. 

Although we showed that collaborative filtering can in principle be done on encrypted 
data, there are a few more issues to be resolved for a practical implementation. For 
instance, one should take into account the computational and communication overhead 
required due to the encryption and decryption of data. Furthermore, the system should 
be made robust against more complex forms of attacks, e.g., an attack where a user 
repeatedly computes similarities to other users, each time using a profile with only one 
item. These issues are topics for further research. 

Although we only discussed collaborative filtering, the technique of computing sim- 
ilarities between profiles on encrypted data only is interesting for other applications as 
well, such as user matching and service discovery. In the vision of ambient intelligence, 
where much more (sensitive) profiling will be used in the future, this may play a crucial 
role in getting these applications accepted by a wide audience. This is also a topic for 
further research. 



Abstract. Service discovery is the process of locating, or discovering, one or 
more documents that describe a particular service. Most of the current service 
discovery approaches perform syntactic matching, that is, they retrieve service 
descriptions that contain particular keywords from the user's query. This often 
leads to poor discovery results, because the keywords in the query can be se- 
mantically similar but syntactically different, or syntactically similar but se- 
mantically different from the terms in a service description. Another drawback 
of the existing service discovery mechanisms is that the query-service matching 
score is calculated taking into account only the keywords from the user’s query 
and the terms in the service descriptions. Thus, regardless of the context of the 
service user and the context of the services providers, the same list of results is 
returned in response to a particular query. This paper presents a novel approach 
for service discovery that uses ontologies to capture the semantics of the user’s 
query, of the services and of the contextual information that is considered rele- 
vant in the matching process. 



1 Introduction 

Ambient intelligence aims at enriching users' lives by providing ubiquitous, transpar- 
ent and intelligent electronic services [1]. These services are diverse and distributed in 
the user's environment. 

A key feature of ambient intelligence is transparency on service provisioning. The 
process of discovering and invoking relevant services should be hidden from the 
users' point of view. In order to realize this scenario, we need mechanisms to provide 
smart service discovery based on the current situation of the user (e.g., user's location, 
his interest, user's environment characteristics, etc). We define the user's current 
situation as context [9]. Contextual information of the user is therefore an essential 
aspect to accomplish transparency in the service discovery process within the ambient 
intelligence scenario. 

Most of the existing service discovery mechanisms retrieve service descriptions 
that contain particular keywords from the user’s query. In the majority of the cases 
this leads to low recall¹ and low precision² of the retrieved results. The reason for the 
first is that query keywords might be semantically similar but syntactically different 
from the terms in service descriptions, e.g. ‘buy’ and ‘purchase’ (synonyms). The 
reason for the second is that the query keywords might be syntactically equivalent but 
semantically different from the terms in the service description, e.g. ‘order’ in the 
sense of proper arrangement and ‘order’ in the sense of a commercial document used 
to request supply of something (homonyms). Another problem with keyword-based 
service discovery approaches is that they cannot completely capture the semantics of 
user’s query because they do not consider the relations between the keywords. One 
possible solution for this problem is to use ontology-based retrieval. In this approach, 
ontologies are used for classification of the services based on their properties. This 
enables retrieval based on service types rather than keywords. 

Another drawback of the existing service discovery approaches is that the query- 
service matching score is calculated taking into account only the keywords from the 
user’s query and the terms in the service descriptions. Thus, regardless of the context 
of the user and the context of the service providers, the same list of results is returned 
in response to a query. By definition, context is a situation of an entity (person, place 
or object) that is relevant to the interaction between a user and an application [9]. 
Therefore, considering the context in the query-service matching process can improve 
the quality of the retrieved results. However, contextual information is highly interrelated 
and has many alternative representations [27], which makes it difficult to interpret 
and use. One possible solution is again to use ontologies to specify the interrelations 
among context entities and ensure a common, unambiguous representation of 
these entities. 

This paper presents a novel approach for service discovery that uses ontologies to 
capture the semantics of the user’s query, of the services and of the contextual infor- 
mation that is considered relevant in the matching process. The paper is based on a 
master thesis [6] that can be used as further reading. 

The paper is structured as follows: section 2 presents the existing service discovery 
approaches and their major drawbacks. Section 3 presents our service discovery ap- 
proach. Section 4 discusses the implementation and evaluation of the proposed ap- 
proach and section 5 summarizes our contributions. 



2 Existing Service Discovery Approaches 

2.1 Traditional Service Discovery 

CORBA [24] proposed one of the first service discovery approaches. It specifies 
naming [23] and trading services [22] used to discover objects on a network. The 
naming service is keyword-based whereas the trading service supports discovery 
based on the service types. UDDI [29] is the most used service discovery approach 
for web services [3]. The core of the UDDI architecture is a central business registry 
that functions as a naming and directory service. Services in this registry are described 
from three different perspectives, comparable to the white, yellow and green 
pages of a telephone directory. Furthermore, service descriptions consist of tModels 
that classify the business or web service using standard or user-defined taxonomies. 
OSGi [25] proposes an open service platform for the delivery of applications 
and services to all types of networked devices. Service discovery is performed by 
querying the name or the type of a service. OGSA [14] focuses on the integration of grid 
computing paradigms with web services technologies. A service is advertised by its 
service information (i.e. name, type) in the registry. By retrieving this service 
information, the user can discover services. 

¹ recall - a standard measure of information retrieval performance, defined as the number of 
relevant items retrieved divided by the total number of relevant items in the collection. The 
highest value of recall is achieved when all relevant items are retrieved. 

² precision - a standard measure of information retrieval performance, defined as the number 
of relevant items retrieved divided by the total number of items retrieved. The highest value 
of precision is achieved when only relevant items are retrieved. 

Klein [19] discusses several categories of service discovery technologies and their 
limitations for the quality of the service discovery result. According to Klein’s cate- 
gories, the traditional service discovery approaches are either keywords-based or 
table-based and they don’t take into account the contextual information. As discussed 
in the introduction this leads to low quality of the retrieved results. 



2.2 Context-Aware Service Discovery 

This section presents the existing approaches that consider the contextual information 
in the service discovery process. It also discusses the problems of using contextual 
information in those approaches. 

The Cooltown [15] project allows users to discover services that are in the user’s 
vicinity. In this approach the location of the user and the service is used to derive that 
the user is in the service area. This way, services that are close to the user are returned 
by the service discovery mechanism. The context toolkit [8] is a development toolkit 
that provides functionality to discover services using contextual information. It allows 
for describing services by means of white and yellow pages that include contextual 
information. The platform for adaptive applications [10] proposes an architecture for 
applications that adapt their behavior according to the context of the user. The plat- 
form enables discovery of context providers by the type of context they advertise. 
This contextual information is used to adapt the application behavior. The CB-Sec 
project [20] provides functionality to discover services that are in the vicinity of the 
user. This approach takes into account the user and service capabilities in the service 
discovery process. 

The contextual information is highly interrelated and has many alternative repre- 
sentations [27]. This makes it difficult to interpret and use. Context providers and 
context consumers (e.g. service providers or requestors) may have different 
understandings of the same contextual information. This leads to misinterpretation of the 
information, which in turn leads to misunderstanding of the user goal and therefore 
poor discovery results. 



2.3 Ontology-Based Service Discovery 

As we said earlier, a shared understanding of the concepts used to describe services 
and contextual information is crucial to ensure high-quality service discovery results. 
The required shared understanding can be provided by the use of ontologies [11]. 
There are several approaches that use ontologies in the service discovery process. 
However, none of them considers the use of contextual information in the service 
discovery process. 

OWL-S [29] is an OWL [31] service ontology that can be used to semantically de- 
scribe services. It allows specification of services in terms of their inputs, outputs, 
conditions, that have to hold true before the service execution (called preconditions in 
OWL-S terms), and post-conditions, that represent the state of the environment after 
the service execution (called effect in OWL-S terms). COBRA [7] divides the world 
into different application domains. Each domain is specified by its own ontology that 
provides shared concepts and relations for service discovery. OntoMat [2] uses on- 
tologies to map the concepts used by the service requestor to the concepts used by the 
service provider. This way, those concepts can be compared and reasoned about. 
CBSDP [18] is a service discovery protocol for ad hoc networks. CBSDP uses on- 
tologies to interpret the data exchanged during service execution. 



3 Our Approach 

We argue that the use of contextual information in the service discovery process in- 
creases the recall and precision of the retrieved results. On the one hand, the contex- 
tual information makes the user’s query more information-rich and thereby provides 
means for higher precision of the retrieved results, that is, the context helps to capture 
better the user’s goal. On the other hand, the contextual information can serve as an 
implicit input to a service that is not explicitly provided by the user. This prevents 
filtering out the services that require this input from the user, which leads to higher 
recall of the retrieved results. However, as discussed in Section 2.2, contextual information is 
very complex and has many alternative representations. Therefore, we propose to use 
ontologies to model such information. The use of ontologies for describing users’ 
queries, service properties and contextual information is advantageous. First, ontolo- 
gies provide a vocabulary for modeling knowledge in a restricted domain. They are 
built by reaching a consensus within a community of interest and thus are a key en- 
abler for seamless knowledge interchange. Second, ontology languages are usually 
grounded with formal semantics such as model theory or description logic. This in 
turn enables unambiguous definitions of compound concepts. Based on these defini- 
tions it is possible to infer new implicit information from present (explicit) informa- 
tion. Finally, the common vocabulary and precise mathematical specification of se- 
mantics open the way to automatic information processing since the information is 
not only understood by humans but also by machines. 



3.1 Positioning 

Figure 1 shows the position of our approach with respect to the existing service 
retrieval approaches identified by Klein in [19]. 

We position our approach in the space between the concept-based approach and the 
deductive retrieval approach. The deductive approach offers higher recall and precision; 
however, modeling service functionality by means of formal logic is sometimes an 
extremely difficult task. Another disadvantage of the deductive approach is that the 
search process is usually very slow due to the high computational complexity of the 
proof process. 

Fig. 1. Positioning of our approach 



3.2 Overview 

In our approach, we distinguish several high-level components (fig. 2). The inputs of 
our matching component are: the user’s query (i.e. the service request), a set of 
advertised services (i.e. service descriptions), a set of context providers, and the 
ontologies used by the user, service and context providers. 

In our approach service users, service providers and context providers achieve a shared 
understanding by using ontologies to which they all commit. Users and service providers 
have associated context providers that can deliver different types of contextual 
information, for instance, user location or weather conditions in a certain service area. 
To enable unambiguous knowledge interchange, our approach uses domain-specific 
ontologies. In such ontologies, concepts from a particular domain and 
relations among them are precisely specified. This enables reasoning on the user 
queries, service descriptions and associated contextual information. For instance, 
consider a shop that advertises the sale of ‘music products’. If a user specifies that he 
wants to buy a ‘music CD’, the query and the service description do not match 
syntactically. If we employ domain-specific ontologies to derive that ‘music CD’ is a 
‘music product’, we can conclude that the query and the service description match 
semantically. 
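
The ‘music CD’ example can be sketched with a tiny is-a hierarchy. The dictionary below is a hypothetical stand-in for a domain-specific ontology; a real ontology language such as OWL supports far richer reasoning than this parent lookup.

```python
# Hypothetical fragment of a domain ontology: child concept -> parent concept.
IS_A = {
    "music CD": "music product",
    "music DVD": "music product",
    "music product": "product",
    "book": "product",
}

def is_a(concept, ancestor):
    """True if `concept` equals `ancestor` or is a (transitive) subconcept of it."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = IS_A.get(concept)
    return False

def syntactic_match(query, advertised):
    return query == advertised

def semantic_match(query, advertised):
    return is_a(query, advertised)

print(syntactic_match("music CD", "music product"))  # False
print(semantic_match("music CD", "music product"))   # True
```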

We distinguish four different service properties that are handled differently by our 
matching algorithm: 

■ Service type: Refers to an entry in some ontology or taxonomy of services. 
An example of such an ontology is the UNSPSC³ classification system. 

■ Outputs: Refers to a concept from a domain-specific ontology that specifies 
the value that this particular service delivers to its environment (e.g. music 
products, traffic information, etc.). 

■ Inputs: Refers to a concept from a domain-specific ontology that specifies the 
sacrifice a user is ready to make in order to receive the value delivered by a service 
(e.g. money, effort to fill in a questionnaire, etc.). 

■ Contextual attribute: Represents the contextual information derived from the 
user (e.g. user location) and service providers (e.g. service location). 



3.3 Service Grounding 

To be able to invoke a service after its discovery, our approach uses a WSDL 
grounding mechanism. WSDL [32] defines services as collections of network endpoints. 
The abstract definition of an endpoint, called interface, is separated from its 
concrete network deployment, protocol and data encoding through reusable bindings. 
Interfaces are abstract collections of operations that contain input and/or output 
messages, which consist of message parts. 

Fig. 3. Service model and mapping to the WSDL metamodel 

Fig. 3 presents the mapping between our service model and the WSDL metamodel. In our 
service model each service has a service type. This service type is 
mapped to a WSDL interface. The service itself maps to an operation in this interface 
(e.g. SellMusicCD). The inputs and outputs of the service map to messages in WSDL 
whereas concepts map to message parts. The following example outlines our 
grounding mechanism. 




<operation name="SellMusicCD"> 
  <input message="credit_card"/> 
  <output message="CD"/> 
</operation> 
<message name="credit_card"> 
  <part name="type" payontology:output="payontology:#CreditCardType"/> 
  <part name="card" payontology:output="payontology:#Card"/> 
  <part name="expire" payontology:output="payontology:#ExpireDate"/> 
</message> 



³ http://www.un-spsc.org 



3.4 Matching Algorithm 

Our approach matches a user query with a set of available service descriptions. The 
result is a set of service descriptions that semantically match the user query. To rate 
the matches we defined a quality measure called matching degree. 

Matching Degree 

Consider a user request R and a service description S. To rate how relevant a particular 
match between R and S is, we use the number of service properties (i.e. type, inputs, 
outputs and contextual attributes) from the request that are not present in S. Based on 
those missing properties we classify the match in five different categories, defined by 
Li [21] (fig. 4). 

The first category indicates an exact match. The request has the same properties as the 
service description, i.e. there are no missing properties. This is the best possible match. 
The second category is called plug-in match, which represents the second best match. It 
indicates that the service is capable of more than the requestor wants. The third 
and fourth categories, called subsume match and intersection match, respectively, 
indicate that the service can only partially provide what the user wants, i.e. the number 
of missing properties is greater than zero. The fifth match category indicates a disjoint 
match, i.e. the request and the service do not share any properties. 

Our approach uses this initial classification to further classify matches into three 
types of matches that are useful for the user: 

■ Precise match : Exact and Plug-in matches. The service is capable of providing 
the requested functionality or more. 

■ Approximate match : Subsume and intersection matches. The service is capable 
of providing part of the requested functionality. 

■ Mismatch : Disjoint match. The service is not capable of providing the re- 
quested functionality and will not be returned to the user. 
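
One way to read the five categories and their grouping into three match types is as set relations between the properties of the request and those of the service description. The sketch below follows that reading; treating properties as plain set elements (with no ontology reasoning) and the function name are assumptions of this example.

```python
def match_category(request_props, service_props):
    """Classify a request/service pair by comparing their property sets."""
    R, S = set(request_props), set(service_props)
    if R == S:
        return "exact"          # no missing properties
    if not R & S:
        return "disjoint"       # nothing in common
    if R < S:
        return "plug-in"        # service offers more than requested
    if S < R:
        return "subsume"        # service offers only part of the request
    return "intersection"       # partial overlap in both directions

# Grouping of the five categories into the three types useful for the user.
MATCH_TYPE = {"exact": "precise", "plug-in": "precise",
              "subsume": "approximate", "intersection": "approximate",
              "disjoint": "mismatch"}

request = {"sell", "music CD"}
for service in ({"sell", "music CD"}, {"sell", "music CD", "deliver"},
                {"sell"}, {"sell", "book"}, {"rent", "car"}):
    category = match_category(request, service)
    print(sorted(service), category, MATCH_TYPE[category])
```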

Algorithm 

The goal of the matching algorithm is to classify the available set of services using 
the service request into the three previously defined matching types. This is done in 
four steps (fig. 5). 

Fig. 4. Match categories: exact (|mprop| = 0, R = S), plug-in (|mprop| = 0, R ⊂ S), 
subsume (|mprop| > 0, S ⊂ R), intersection (|mprop| > 0, R ∩ S ≠ ∅), 
disjoint (|mprop| = |prop_R|, R ∩ S = ∅) 

The starting point of the matching process is the set of all services (S) available to the 
matchmaker (e.g. n). The first step filters out those services that are not of the 
desired service type provided in the user request (R). This results in a smaller set of 
services (e.g. n-k) with service type R_t. The second step filters out all service 
descriptions that do not have the desired service output. Again, this results in a smaller 
set of services (e.g. n-k-m) that can provide the requested output R_o. In the third step, 
the services of this set are queried for the inputs (S_i) they require. If the 
required inputs are provided by the user or can be provided by the context providers 
(e.g. when the service needs as input the user location, which is not provided by the 
user but by the user location context provider), the match is classified as precise. 
Otherwise the match is classified as approximate. The final step orders the two sets using 
the contextual attributes (discussed in the next section). All steps are represented in 
the following matchmaking algorithm. 

Matching (R, S) { 
    S'  = query_Registry (R_t, S)          // step 1: filter on service type 
    S'' = query_Registry (R_o, S')         // step 2: filter on service output 
    forall s in S'' do {                   // step 3: check the required inputs 
        S_i = query_Inputs (s) 
        if provided (S_i, R_i) then { 
            Precise.append (s) 
        } 
        else { 
            if query_ContextProviders (userID, missing_Inputs (S_i, R_i)) then { 
                Precise.append (s) 
            } 
            else { 
                Approximate.append (s) 
            } 
        } 
    } 
    // step 4: order both sets using the contextual attributes 
    P = order_with_ContextualAttributes (Precise) 
    A = order_with_ContextualAttributes (Approximate) 
    return result (P, A) 
} 
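
The first three steps of the algorithm can be illustrated with a minimal Python sketch. It is not the implementation used in the platform: the Service and Request classes, their field names and the context_inputs parameter (a plain set standing in for the context provider query) are assumptions made for this example, and the ordering step is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Service:
    # Hypothetical, simplified service description.
    name: str
    service_type: str
    outputs: frozenset
    inputs: frozenset


@dataclass(frozen=True)
class Request:
    service_type: str   # R_t
    output: str         # R_o
    inputs: frozenset   # R_i: inputs supplied by the user


def match(request, services, context_inputs):
    """Split services into (precise, approximate) matches.

    context_inputs is the set of input names that context providers can
    supply for this user (an assumed stand-in for the provider query).
    """
    # Step 1: keep services of the requested type.
    typed = [s for s in services if s.service_type == request.service_type]
    # Step 2: keep services that can produce the requested output.
    with_output = [s for s in typed if request.output in s.outputs]
    # Step 3: check whether all required inputs are available.
    precise, approximate = [], []
    for s in with_output:
        missing = s.inputs - request.inputs
        if not missing or missing <= context_inputs:
            precise.append(s)
        else:
            approximate.append(s)
    return precise, approximate
```

A service whose only missing input is the user location ends up in the precise list when a location context provider is available, and in the approximate list otherwise.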



Contextual Attributes Model 

Users can define preferences about certain properties of a service they want to 
discover. An example is the preference nearby, which states that the user 
wants to retrieve a service close to their current position. We call these service 
properties/user preferences “contextual attributes”. A contextual attribute is defined 
by a simple rule: Attribute =def Statement. The statement defines the meaning of the 
attribute (e.g. nearby =def distance (userposition, serverposition) < maxdistance). 
These contextual attributes are used to order the sets of returned matches. 
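
As an illustration, the nearby rule can be read as a boolean predicate. The sketch below assumes planar coordinates in metres and a default maximum distance of 500 m; both are choices made for this example and are not taken from the paper.

```python
import math


def distance(a, b):
    """Euclidean distance between two (x, y) positions in metres (assumed units)."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def nearby(user_position, server_position, max_distance=500.0):
    """Contextual attribute: true when the service lies within max_distance."""
    return distance(user_position, server_position) < max_distance
```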

We use a clustering mechanism to rate services based on the preferences they satisfy. 
For that purpose, we use concept lattices [13], a mechanism from formal concept analysis. 
Concept lattices can be used to study how objects can be hierarchically 
grouped together according to their common attributes. The starting point is a concept 
model which consists of a triple (G, M, I). G is a set of objects, M is a set of attributes 
and I is a binary relation between them (I ⊆ G × M). A commonly used representation 
of this model is a cross table (fig. 6a). Each object is one row in the table while 
the attributes are the columns. The binary relation is represented by a cross at the 
intersection of a row and a column. The cross table is the basis for a lattice line 
diagram that visualizes the attribute commonality of the objects (fig. 6b). 

Fig. 6. Concept lattices 



This is a hierarchical diagram that presents the most generic objects at the top; 
going down the diagram, the objects become more specific (i.e. have more attributes). 
A node in the diagram is called a concept and can contain objects that share 
the same attributes. Such a concept inherits the attributes of its parents in the 
diagram. The top node is a concept that contains no attributes. One level down, 
object 1 is encapsulated by a concept that contains attribute 1. Again one level down, 
we see that object 3 is related to the concept containing attribute 1 and to a 
concept containing attribute 2. Therefore, object 3 has attribute 1 and attribute 2 and 
shares attribute 1 with object 1. Object 2 has attribute 3 and shares attribute 2 with 
object 3. The bottom node contains the objects that have all attributes (in this case 
none). This model is analogous to our contextual attribute model, where a service 
(object) has some contextual attributes (attributes) (fig. 7). 

Fig. 7. Lattice cross table 
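
The three-object example described above can be reproduced with a short Python sketch that enumerates the concepts of a cross table. This is a brute-force enumeration intended for small tables only, not the algorithm used in our prototype; the function names and the EXAMPLE table are introduced here for illustration.

```python
from itertools import combinations

# Cross table of the example: object -> set of attributes it has.
EXAMPLE = {
    "object 1": {"attribute 1"},
    "object 2": {"attribute 2", "attribute 3"},
    "object 3": {"attribute 1", "attribute 2"},
}


def common_attributes(objects, table):
    """Attributes shared by all given objects (all attributes for no objects)."""
    shared = set().union(*table.values())
    for obj in objects:
        shared &= table[obj]
    return shared


def objects_having(attributes, table):
    """Objects that have every attribute in the given set."""
    return {obj for obj, attrs in table.items() if attributes <= attrs}


def formal_concepts(table):
    """Return all (objects, attributes) concepts of the cross table."""
    concepts = set()
    names = list(table)
    for size in range(len(names) + 1):
        for subset in combinations(names, size):
            attrs = common_attributes(subset, table)
            objs = objects_having(attrs, table)
            concepts.add((frozenset(objs), frozenset(attrs)))
    return concepts
```

For the example table this yields six concepts: the top node (all objects, no attributes), one concept each for attribute 1 and attribute 2, one each for object 3 and object 2, and the bottom node (no objects, all attributes).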



The request and all retrieved service descriptions are added to this table as objects 
(rows). The preferences are added as attributes (columns) and evaluated for each service; 
a cross marks every preference that a service satisfies. From this table a lattice line 
diagram is calculated (fig. 8). 

This diagram should be read from top to bottom. A child node shares the attributes of its 
parents (e.g. service 5, service 9 and the request all have the attributes nearby, train, 
open, price range2, etc.). So, by reasoning on the position of the services relative to 
the position of the request, an ordering of the services can be made. Services positioned 
higher in the diagram than the request miss preferences. The higher a service is 
positioned, the more preferences it misses and the lower it is ordered in the resulting 
list. 
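
The effect of this ordering can be approximated without drawing the diagram. The sketch below simply counts how many requested preferences each service misses and sorts on that count; this is a simplification of the lattice-based reasoning, and the function name and the tie-breaking on service name are assumptions made for the example.

```python
def order_by_missed_preferences(requested, services):
    """Order service names by the number of requested preferences they miss.

    requested: set of preference names in the request.
    services: dict mapping a service name to the set of preferences it satisfies.
    """
    def missed(name):
        return len(requested - services[name])

    # Fewest missed preferences first; ties are broken alphabetically.
    return sorted(services, key=lambda name: (missed(name), name))
```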



Fig. 8. Lattice line diagram 



4 Implementation and Evaluation 

Our approach was implemented as part of an experimental platform [12]. The platform 
provides the environment for mobile context-aware applications to use third-party 
content services (i.e. web services). The platform is implemented using Java 
technology. Parlay X [26] is used to interact with 3G network services, while the 
AXIS framework [4] is used to interact with the third-party content services. The 
client side is implemented using Personal Java and runs on a variety of embedded 
devices (e.g. smartphone, PDA). 

Our approach is embedded in the matchmaker component of the experimental platform. 
Service advertisements are stored in MySQL databases as persistent Jena [17] 
models, and retrieved by executing RDQL [16] statements. The approach is implemented 
in a modular way by encapsulating it in web services. Therefore, the approach is not 
only suitable for handling explicit requests by the user, but is also able to deal 
with implicit requests, for instance, by an ambient intelligence environment. 

We evaluated the approach using the implemented prototype. In one of the evaluations, 
queries were issued using the prototype. Recall and precision rates were calculated 
and compared to the recall and precision rates obtained with keyword-based mechanisms. 
As an example, a query containing homonyms showed a gain in recall and precision 
of more than fifty percent. Further details on the evaluation can be found in the 
master's thesis [6]. 
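
For reference, precision and recall follow their standard definitions: precision is the fraction of retrieved services that are relevant, and recall is the fraction of relevant services that are retrieved. A minimal sketch is given below; the function name is ours and the handling of empty sets is a convention chosen for the example.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a retrieved result list and a relevant set."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    # Convention for this sketch: an empty denominator yields 0.0.
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall
```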

5 Conclusion 

In this paper we discuss the shortcomings of existing service discovery approaches 
and propose a novel approach [6] to overcome some of them. Our approach⁴ uses the 
available contextual information about a particular user or service provider (e.g. user 
location or service opening times). In addition, it uses ontologies to semantically 
express user queries, service descriptions and the contextual information. 

⁴ This work is part of the Freeband AWARENESS project (http://awareness.freeband.nl). 
Freeband is sponsored by the Dutch government under contract BSIK 03025. 

The use of contextual information in our approach resulted in higher quality of the 
retrieved results. On the one hand, the contextual information makes the user’s query 
more information-rich (e.g. by adding extra information about the user’s preferences) 
and thereby increases the precision of the retrieved results. On the other hand, the 
contextual information serves as an implicit input to a service that is not explicitly 
provided by the user. This allows our matching algorithm to select services that 
would be filtered out otherwise, which leads to higher recall of the retrieved results. 

Besides the use of contextual information, we showed that the use of ontologies in 
context-aware service discovery has advantages. First, ontologies provide a 
shared vocabulary for the specification of user queries, service descriptions and 
contextual information. This provides a basis for matching meaningful user queries 
with meaningful service descriptions rather than just syntactic textual descriptions. 
Second, we used OWL, which is grounded in the formal semantics of Description 
Logic [5]. This allowed us to define compound concepts unambiguously and to reason 
about them. 

Finally, the use of concept lattices for clustering services with similar attributes 
provided a convenient way to order services by their relevance to the user. However, 
the designed mechanism is just a first step towards using concept lattices in service 
discovery. Our future work includes a broader investigation of the use of concept 
lattices in service discovery. 
Abstract. We present an access control architecture that restricts access to train 
and tourist office services to travelers who are actually on the train. For this 
purpose, location and velocity information are used as alternative authentication 
and authorization credentials. The ambient intelligence of our transparent access 
control mechanism manifests itself as a pool of verification methods of increasing 
strength. These verification methods are based on the proximity principle and on 
relational contextual history information about WLAN access points, 
fellow train travelers, the train itself and its conductor or driver. 



1 Introduction 

Train carriages are nowadays full of laptop-wielding passengers working to improve 
their productivity. Most of these passengers would be very keen to use a high-speed 
wireless network if one were available. Furthermore, they would welcome the opportunity 
to transparently access the Internet and their corporate network, to send and receive 
email, do some gaming, or check travel information. The “Mobile and Wireless” 
project (www.telin.nl) aims to deliver just such services via a WLAN infrastructure 
consisting of several fixed access points along the railway track to various categories 
of end-users (e.g. students, tourists, business travelers), as described below: 

“Brenda from Spain is traveling by train from Utrecht to Enschede. Just past 
Hengelo, she switches on her GPS- and WLAN-enabled laptop. The wireless network 
card in her laptop detects a nearby WLAN access point and associates with it. A 
service portal containing links to train and tourist services appears on her screen. 
First, she launches the Tourist Travel Scheduler. She arranges a taxi that will take 
her from Enschede Central Station to her hotel. Next she starts up the interactive 
Night Life video provided by the WLAN owner ProRail. When she is sitting in the 
lounge of her hotel in the evening, Brenda longs for more Night Life. She enters the 
URL of the portal again to watch the video once more. Unfortunately, access is denied.” 

Like many other people, Brenda does not want to be bothered with authentication 
issues. It is not desirable for an ordinary train traveler to have to authenticate 
continuously by means of username and password or digital certificates in order to 
access services. Evidently, such user-ID authentication schemes are not a viable 
solution for train travelers and are hardly usable in dynamic environments where access 
to information, services or the network should be transparent and seamless. Moreover, 
checking user identities in these standard ways involves a complex authentication 
infrastructure for a potentially large number of users, which is undesirable from 
the perspective of a WLAN owner or service manager. Another reason is that the user 
identity is often not only irrelevant but must also frequently be kept secret. Still, 
ProRail wants to restrict access to train travelers only. Context-based access control 
schemes may form transparent, seamless and privacy-protecting alternatives to standard 
authentication schemes, not only in our scenario but also in medical as well as 
domestic settings: 

• Medical records of a patient in a hospital can be read by a clinician when (s)he is 
in the proximity of the computer [1]. 

• Permission to talk to a resident in another room via the intercom may depend on the 
activity in which the resident is currently involved [2]. 

In this paper we will investigate the feasibility of using context information, and in 
particular location and velocity, of train passengers for controlling their access to 
services offered via a WLAN. We will propose an access control solution and address 
the advantages and disadvantages of this particular type of context-sensitive access. 
Fig. 1. GPS location (dotted line) and velocity (inset) on the railway line Hengelo - 
Enschede. In the magnified window the squares are GPS location measurements and the 
little arrows indicate the direction of the train for each square. 



2 Context-Based Access Control in Practice 

For our purposes we use basic GPS to determine the location. Fig. 1 shows the 
location of a train traveler like Brenda obtained from a GPS device inside the train. 
From this experiment we may conclude that GPS connectivity is available inside the 
train and, more importantly, that the information obtained from GPS is sufficiently 
accurate to be used as a credential for access control. However, in contrast to 
passwords or other shared secrets, location information is public information and 
therefore a priori not credible enough to allow access to more valuable services such as 
Internet access and third-party services. More is needed. 

An additional context parameter could be the actual velocity of the train and thus 
the movement of the train traveler. This would exclude undesired access to services by 
persons standing still along the railway track, the so-called “parking lot” syndrome. 
Still, a person could send a fake access request message that includes valid coordinates 
and a guessed velocity of the train. Verification of additional contextual history 
information is therefore needed for even stronger access control to services on the train. 

The basic idea is to verify a user’s context via a known reference point in the 
proximity of the user. There are several ways to do this. One way is to correlate 
the GPS-obtained location of the user with the location of the WLAN access 
point the user is associated with. This solution, however, does not exclude the 
parking lot issue, because the mismatch between the network coverage and the location 
boundaries for user access could be unacceptably large. It also requires data exchange 
between the link layer and higher-level protocol layers of the WLAN, which is difficult 
to realize. 

Our solution is to compare the user’s context with that of other ‘online’ travelers in 
the train; they should have provided similar contextual history information. Here we 
have to be almost sure that the other travelers are sitting in the same train. So, 
velocity, direction and location must be more or less the same for these travelers over 
larger time-spans. Optionally, the train conductor or the train itself can be equipped 
with a GPS device. In this case, there is always a trusted reference point available in 
the train. 
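
The comparison of contextual histories can be sketched as follows. This is an illustrative check, not the verification method of our implementation: the Sample fields, the requirement that both histories share timestamps, and all thresholds (300 m, 3 m/s, 20 degrees, at least three common samples) are assumptions chosen for the example.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    t: float        # timestamp in seconds
    lat: float      # degrees
    lon: float      # degrees
    speed: float    # metres per second
    heading: float  # degrees clockwise from north


def haversine_m(a, b):
    """Great-circle distance in metres between two samples."""
    radius = 6371000.0
    p1, p2 = math.radians(a.lat), math.radians(b.lat)
    dp = p2 - p1
    dl = math.radians(b.lon - a.lon)
    h = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * radius * math.asin(math.sqrt(h))


def heading_diff(a, b):
    """Smallest angle in degrees between two headings."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def same_train(claim, reference, max_dist_m=300.0, max_speed_diff=3.0,
               max_heading_diff=20.0, min_samples=3):
    """True if the claimed history matches the reference at every shared timestamp."""
    ref_by_time = {s.t: s for s in reference}
    pairs = [(c, ref_by_time[c.t]) for c in claim if c.t in ref_by_time]
    if len(pairs) < min_samples:
        return False  # too little shared history to decide
    return all(
        haversine_m(c, r) <= max_dist_m
        and abs(c.speed - r.speed) <= max_speed_diff
        and heading_diff(c.heading, r.heading) <= max_heading_diff
        for c, r in pairs
    )
```

A person standing still along the track fails the speed test even when the reported coordinates are valid, which is the “parking lot” case discussed above.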
Fig. 2. Access control architecture and access control protocol. 



We discern in Fig. 2 the following access control components and related protocol: 



1. The GPS-enabled device of the train traveler (MN) requests a local service 
offered by a Service Provider (SP) via a portal. 

2. The SP asks the Authentication Server (AS) to start up an authentication session 
with the MN. 

3. The AS asks the MN for contextual history information (longitude, latitude, 
velocity, direction). Note that a secure server-side SSL channel can be set up with 
the MN to guarantee the confidentiality and integrity of the context information. 

4. The MN sends the requested information to the AS. 

5. The AS checks via the Context Database Management System (DBMS) whether 
the location coordinates and velocity vector given by the MN are in agreement 
with the access control policy: is the MN really on the train? 

6. The DBMS collects and computes all relevant contextual information of train 
travelers and personnel. The DBMS checks whether the context information presented 
by the AS is valid: it compares the context information with that of other 
travelers or train personnel (MNs) on the same train. For this purpose the DBMS 
may have to request one or more of the previously authenticated fellow MNs 
(preferably those of train personnel) for an update of their contextual information. 
Next, it compares this with the claimed context of the new MN. 

7. A positive or negative verification response is sent back to the AS. 

8. The AS informs the SP about the outcome of the authentication process. 

9. The MN is granted access to the service if the verification was positive. Because 
of the ever-changing context, the access control process has to be repeated 
regularly. 
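
The message flow of the nine steps can be summarised in a small Python sketch. All class and method names are hypothetical, network transport and SSL are omitted, and the context comparison is passed in as a function so that any verification method can be plugged in. Registering a successfully verified MN as a new reference is our reading of step 6 and should be treated as an assumption.

```python
class MobileNode:
    def __init__(self, name, history):
        self.name = name
        self.history = history  # contextual history held by the device

    def report_context(self):
        # Steps 3-4: the MN returns its contextual history to the AS.
        return self.history


class ContextDBMS:
    def __init__(self, matches):
        self.matches = matches   # comparison function (assumed, pluggable)
        self.references = {}     # histories of previously authenticated MNs

    def register(self, name, history):
        self.references[name] = history

    def verify(self, claim):
        # Steps 6-7: compare the claim with the known reference histories.
        return any(self.matches(claim, ref) for ref in self.references.values())


class AuthenticationServer:
    def __init__(self, dbms):
        self.dbms = dbms

    def authenticate(self, mn):
        # Steps 2-5 and 8: collect the claim, have it verified, report back.
        claim = mn.report_context()
        verified = self.dbms.verify(claim)
        if verified:
            self.dbms.register(mn.name, claim)
        return verified


class ServiceProvider:
    def __init__(self, auth_server):
        self.auth_server = auth_server

    def request_service(self, mn):
        # Steps 1 and 9: grant access only after a positive verification.
        return "granted" if self.auth_server.authenticate(mn) else "denied"
```

In a deployment the request would be repeated periodically, since the context changes as the train moves.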

Rogues can be readily denied access even when they fall within the service range 
of the passing train: they simply lack knowledge of the positions over time of the 
fellow train travelers’ MNs and of the train. Even travelers in passing trains can be 
readily distinguished by means of GPS (see Fig. 1). Moreover, additional 
context parameters, like the current travel scheme and velocity history of the 
train for a specific track (see also Fig. 1), can be taken into account to enhance the 
overall access control level. Empirical data support these assertions about our 
service access control implementation. 



3 Conclusions and Work in Progress 

The ability to sense and exploit contextual information to augment or replace tradi- 
tional user attributes for the purpose of authentication and access control is critical to 
making security less intrusive and more transparent. Location and velocity obtained 
from GPS provide accurate context information for allowing controlled anonymous 
access to services (see also [3]). Context information, however, usually consists of 
public information making authentication intrinsically difficult. Verification of con- 
text information provided by a user is therefore essential prior to granting access to 
particular services based on this information. Intelligent usage of ambient information 
like the velocity of fellow travelers or the train itself facilitates efficient verification 
of context claims. In the Mobile and Wireless project we will continue our research 
on context-aware security and are currently implementing the proposed solution. 
Abstract. The next generation of sportsmen will be monitored and steered by a coach 
on the basis of their individual performance. Not only physiological 
performance will be monitored, but also contextual data, such as the position of the 
sportsmen. Based on performance models, the key parameters that should be 
monitored for rowing and soccer were identified. These parameters must be 
acquired and processed in real time, and feedback on them must be given in real time. 
In this short paper, we discuss the methods and technology developed to achieve 
these goals. The work presented in this short paper is still in progress. 



1 Introduction 

Today, in order to optimise the performance of sportsmen beyond its already high 
level, sensor-based monitoring and coaching of sportsmen is becoming more 
desirable [1, 2]. Not only monitoring and coaching by a 
coach, but also personal monitoring by the sportsman is gaining importance. In 
this way, sportsmen can train and follow their condition and performance, and this 
encourages and motivates them to keep training. 

The way sportsmen are monitored nowadays raises several problems. 
Firstly, different types of data, like heart rate, position and movements of the 
sportsmen, and other physiological and environmental data, are acquired separately 
(at different locations and moments). Secondly, the interpretation of the large amount 
of data is mostly done after the training or match. As a result, performance cannot be 
monitored in real time, so no real-time coach intervention can be given, apart from the 
traditional visual observation and shouting of suggestions. In professional soccer, the 
state of the art is typically a multi-language evaluation of tactics with the international 
players using chalk and slate. Because the process is labour-intensive, usually only 
one data type is collected per session. A third problem regarding the monitoring 
devices is that they are mostly not based on a comprehensive idea of what defines 
performance (performance models). Performance models are important because they 
enable steering of the training and performance of sportsmen by monitoring only the key 
parameters for each sport. 

TNO, a contract research and development institute in The Netherlands, is developing 
a new personal monitoring device and a remote coaching device. These devices are 
developed for different sports, like soccer, rowing and ice-skating, and based on key 
performance parameters. This paper describes a running project on soccer. 
It is hypothesised that: 

1. Feedback systems based on scientific performance models improve the 
effectiveness of training and lead to a higher performance while preventing 
injuries and over-training [3, 4, 5, 6, 7]. 

2. Performance models for a given soccer team need to be based on individualised 
and team characterisations to obtain the highest predictive value. 

3. Automated interpretation and system-generated suggestions for the coach make 
the overload of data during the game (many players, several parameters, real 
time) manageable. 



2 Methods and Materials 

In order to optimise the training and thereby the performance of soccer players, key 
parameters are acquired in real time and processed, after which feedback is given. This 
is done in co-operation with a professional Dutch soccer club [8]. Based on several 
sessions with coaches, tactics and the condition of soccer players were defined as the 
two main issues in the performance of soccer players. We will focus on tactics here. 

A key performance parameter within tactics can be derived from the position of the 
player. Therefore, the position of the player was first measured using a 
video system placed on several tower wagons. The disadvantage of these video 
images is that they could only be analysed afterwards. The video images were used to 
make a presentation of the position of the players both in 2D and in a Virtual 
Environment (VE). 

In order to map the position of the players in real time as well, a new 
measuring system (Local Position Measurement System, LPS) is currently being integrated 
in this viewing environment [9, 10]. In this new system, each player wears an RF tag in 
his or her clothes. Furthermore, several measuring stations are placed around the 
soccer field. The measuring stations send Radio Frequency (RF) signals and 
receive the answers from the tags at a rate of 300 Hz, with an accuracy of 5-25 cm in an 
open soccer field. If more accuracy is needed, more measuring stations can be placed 
around the field. In this way, the real-time position of each player is known, and can 
be translated into x, y (mainly for position) and z (for jumps) co-ordinates. 

These co-ordinates are translated by a software program for two different devices. The 
first one is a tablet on which the players can be followed in real time in 2D. The tablet 
is an intelligent device which is used by the coach to monitor the position of each 
player, to analyse the patterns of the game (covered defence area, overlap with 
competitors), to interpret them and to intervene (see figure 1). The system is context 
sensitive as it suggests interventions for players in relation to the positions and 
movements of other players and opponents. The added value of this technology 
over the current practice using video images is that the coach can provide real-time 
feedback to his players. The second device to which the co-ordinates are sent is 
a virtual environment with which the training or match can be reassessed. 
It consists of a VE cave in which one can position oneself in the match or training at a 
position and moment of choice (see figure 2). In this VE cave, moments of the 
match or training can be re-experienced by the players and evaluated 
with their coach. 
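
To give an impression of the kind of pattern analysis that x, y co-ordinates allow, the sketch below computes the area spanned by a group of players as the area of their convex hull. Interpreting 'covered area' as a convex hull is an assumption made for this example and is not necessarily the measure used in the system; co-ordinates are assumed to be in metres.

```python
def convex_hull(points):
    """Convex hull of 2D points (Andrew's monotone chain), counter-clockwise."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def polygon_area(polygon):
    """Area of a simple polygon given as a list of (x, y) vertices (shoelace formula)."""
    total = 0.0
    for i in range(len(polygon)):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % len(polygon)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def covered_area(positions):
    """Area in square metres spanned by the given player positions."""
    return polygon_area(convex_hull(positions))
```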
TNO developed a new solution for feedback, namely a tactile vest. Instead of 
shouting and waving directions, the coach activates vibrating elements in a vest worn 
by the player. The coach can activate each element and make the player aware of the 
direction he should be heading for or looking at. In its first pilot, the vest proved to 
be intuitive but somewhat overloaded with functions. Also, the wearing comfort could 
be improved. This is being addressed in current work. 




Fig. 1. 2D feedback on position, direction of movement (stick) and area covered (triangle) 




Fig. 2. VE bird’s eye feedback on relative position in the game (VE cave) 



The hypotheses were validated using a structured expert review. Four top coaches 
were asked to identify important moments in a training match on video. These key 
moments were evaluated with traditional methods (chalk and slate, conversation, pen 
and paper). Then the same moments were evaluated using the 2D tablet and the VE 
cave. After this, the coaches reviewed the usefulness and usability of the systems. 



3 Results 

The general coach satisfaction about the proposed systems was high. With regard to 
the hypotheses: 

Ad 1) Effectiveness of training based on feedback systems: the possibility to ‘play 
back’ a game and to visualise tactical game patterns was highly appreciated. This 
allows for a new level of training instructions. 

Ad 2) Individual performance models: the game patterns have not yet been 
analysed to the extent of characterising individual and team behaviour and 
capabilities. This is currently under development. 

Ad 3) Automated interpretation of tactical game patterns gave insights into the game 
that were not obtainable before. 



4 Discussion 

Within a traditional sport like soccer, there is a risk that sportsmen will not accept 
technology in training. 

• Comfort: the sports-persons are regularly aware of the equipment, which is 
somewhat distracting and annoying. 

• Control: some soccer players may feel that they are being watched (‘spied 
upon’), which may influence their performance negatively instead of positively. 

This leads to the following additional hypothesis 4: ‘Remote coaching of sportsmen 
does not negatively affect the relationship between coach and sports-person’. 

Though all hardware ingredients for ‘ambient intelligence’ are present, the active role 
of the environment (audience, remote audience, field equipment) has not yet been 
fully explored or exploited. This presents a pleasant creative challenge. 
Abstract. From a User-Centered Design perspective, technology pushes are 
often regarded as negative because the ideas behind these pushes do not always 
address user needs, often causing products to fail in the market. Feasibility 
studies help close the gap between technology pushes and demand pulls. Inviting 
users to witness feasibility studies at an early stage of a design process 
not only enables participants to provide input long before full functionality has 
been developed, but also allows them to make that important step from 
imagining what an Ambient Intelligent product can do for them in their daily 
lives to actually experiencing it. 



1 Introduction 

The Ambient Intelligence vision of having electronic environments that are aware of and 
responsive to the presence of people [1] has set a direction for companies to design 
products that bring this vision to life. From a User-Centered Design (UCD) perspective, 
the technology developments behind these products can range from 
technology pushes to demand pulls. A technology push is driven by ideas coming 
from the creative minds of developers trying to perfect their technical solutions in the 
absence of specific customer needs, attempting to find some minimum 
use case that justifies its existence [7] [9]. This contrasts with the demand pull [9], 
which is driven by user needs and requirements. 

UCD techniques for gathering requirements and for participatory design, which involve 
users early on in the design process, include Cultural Probes [4], Technology 
Biographies [2], Technology Probes [6], Role-Playing and Low-fi Prototyping 
[8], and Ethnographic Design Research [3] [5], among others. While these techniques 
are successful in gathering user requirements for new designs and concepts, 
they often rely on the participants’ abilities to imagine what an ambient intelligent 
system is and what it can do for them in their daily lives. By inviting participants 
to witness the results of a feasibility study conducted in HomeLab, we allowed 
participants to shift from imagination to experience. 
1.1 Feasibility Study in HomeLab 

Early on in the process of designing an ambient lighting system for the bathroom, our 
team had to answer a basic question: would people want to have functional and ambient 
lighting in their bathrooms to support their activities? To answer this question, we 
conducted a user study that included interviews, the use of Cultural Probes [4], and a 
workshop in HomeLab in which users were invited to participate. 

Philips Lighting conducted a feasibility study by installing a demonstration of 
state-of-the-art lighting in the bathroom of HomeLab in Eindhoven, the Netherlands. 
The demo was made to show different modes (Night, Wake-up, Day, Relax, Beauty 
light) that created functional as well as ambient atmospheres using light intensity, 
colour temperature of white light, coloured light and transitions. Different light 
sources (LEDs, incandescent, halogen) were installed, which produced both coloured 
and white light (ranging from cold to warm in colour temperature), as well as differ- 
ent intensity levels combined with dynamic changes. This demonstration was used as 
the starting point for the workshop to explore together with the participants the differ- 
ent options that such a system could provide. 



2 User Study 

We conducted a User Study in which we introduced participants to Cultural Probes 
that they would use for a period of one week. Later on, they were invited to partici- 
pate in a workshop in HomeLab where they were able to witness the feasibility dem- 
onstrator of the bathroom lighting system. 

The goal of this study was to gain insight into whether people would want atmospheric 
and functional lighting to support their daily activities in the bathroom and, 
if so, how. This required recruiting a diverse population of participants to address 
different needs. 

There were practical reasons for splitting the user study into two very distinct 
phases: the first part focused on gathering knowledge on users’ activities 
in their homes, while the second part was dedicated to presenting the demo, thus 
allowing participants to experience an ambient lighting system. 

Because in the first part of the study participants would only be asked to imagine 
what ambient intelligent lighting could do, it was interesting for us to see their reac- 
tions before and after seeing the demo. Our assumption was that their answers would 
somehow differ. 



2.1 Participants 

We aimed to recruit at least ten participants for our study to ensure different 
viewpoints. We focused on five families (couples), mainly because they would provide a 
richer look into simultaneous sharing situations in the bathroom. We wanted to know 
whether people actually share their bathroom at the same time and whether lighting should 
support their activities while sharing. Another question that families could help 
answer is how such a lighting system should react to the presence of multiple users. The 
second criterion was to find families that had medium-to-large bathrooms in their 
house. An average Dutch house (the study took place in the Netherlands) has relatively 
small, split bathrooms (compared to American households): usually a separate toilet and 
a small sink on the ground floor and a shower and sink on the first floor. Such small 
houses provide only limited possibilities to look into sharing situations or to have 
space for multiple lighting. 




Fig. 1. This image is a collection of pictures that were made by participants and sent back to us 
as part of the Cultural Probe study 



2.2 Cultural Probes 

The first part of the User Requirements study consisted of sending Cultural Probes to 
the participants’ homes. These probes consisted of a diary, which contained assignments
to answer questions and to indicate events and activities on timelines, and a disposable
camera that allowed participants to take pictures during the period in which they were
filling in their diaries. The Cultural Probe study took place in January 2004 in the
participants’ homes.

The primary purpose of the Cultural Probe study was to gain insight into what 
people do in the context of the bathroom, including activities, places and objects used 
while performing these activities. The second purpose was to start a discussion on 
lighting, namely their current lighting conditions in the bathroom and identify any 
problems they may be experiencing with light. We were also interested in gathering 
input from participants on whether they would be open to experience light in new 
ways in their bathroom, through coloured light, changes in intensity and transitions, 
and whether this type of lighting should support their activities in a functional way. 
Additionally, participants were asked to take pictures to visually support and high- 
light some of the experiences they had while filling-in the diaries. 

We conducted an interview in the participants’ homes where they were introduced 
to the diaries, going generally through every page to answer questions that could 
arise. They were then given disposable cameras and asked to keep track of their pic- 
tures in a picture record table. The facilitator with the help of the participants made a 
floor plan drawing of their bathroom so they could keep track of the places where
they were performing their activities as well as sharing situations. 




Fig. 2. These are samples of pages from the Diaries that were filled-in by participants as part of 
the Cultural Probe study 

The main advantages of this way to elicit requirements included collecting data 
from participants over a period of one week, something that formal interviews do not 
allow because they usually last no longer than a couple of hours. A one-week period 
allowed the participants to reflect on what they were being asked as well as on the 
answers they provided on the previous days. Another advantage of the probes was 
that they provided better conditions for participants to answer questions on a topic 
that most of them would not feel comfortable discussing during a formal interview.
Participants had time to think about what they wanted to answer, which prevented
uncomfortable face-to-face situations.

Fig. 3. This image is a collection of pictures that were made during the Workshops while
participants were experiencing the different modes.



2.3 Workshops 

The second part of the study consisted of Workshops that took place in February 
2004 in HomeLab. The feasibility study conducted by Philips Lighting resulted in a 
demonstrator of advanced bathroom lighting that was presented as a series of sequen- 
tial modes that correspond to different times of the day as well as different activities. 

The main objective of the Workshops was to allow participants to witness and 
have a first-hand experience with the concepts proposed for bathroom lighting, as 
well as coloured lighting, transitions and changes in intensity. In this way, we were 
able to gather requirements from the participants through their reactions to the system
as it was implemented, as well as trigger their imagination for new ideas. The
second purpose was to encourage participants to modify settings to fit their specific 
needs/wishes. 

Participants were welcomed to the HomeLab by the facilitator who led them to the
bathroom. Participants were given an overview of what the Workshop was about and 
were informed that the session would be recorded on DVD (video and audio). 

The Workshop started by inviting participants to take a seat after which lights were 
switched off to allow their eyes to get used to complete darkness (there are no natural 
light sources in this bathroom). 

The bathroom lighting demonstrator was presented in a sequence of modes that in- 
clude a Night Mode, a Wake-up Mode, a Day Mode, a Relax Mode and the Beauty 
Light. Scenarios were used to give participants a context of the occasion and time in 
which each mode would most probably be used. 



3 Study Results 

This workshop proved to be a key point in demonstrating the difference between 
asking participants to imagine an ambient lighting system for the bathroom and letting
them actually experience such a system.




3.1 What People Said on an Ambient Lighting System Before Witnessing the 
Demonstrator 

Although the main focus of the Cultural Probes study was to explore activities inside 
the bathroom, two specific questions on lighting were asked, namely whether lighting
should functionally support their activities in the bathroom and whether coloured
lighting or changes in intensity should be part of this support. Their previous
experiences with such
lighting systems were limited to the use of candles in the context of the bathroom. At 
this point, participants said they would not like to have such a system in their bath- 
rooms. 

While most of the initial assumptions related to the activities performed in the 
bathroom by users were confirmed, the most important finding from the diaries was 
the request for functional lighting that supports activities in the bathroom, mainly 
through having more sources of white light in their bathrooms, with the option to dim 
light. The other important finding was the unanimous reluctance to have coloured 
lighting in the context of the bathroom. 

At this point in the study, it was very difficult for participants to have a clear idea 
on the full range of possibilities that an advanced lighting system could offer. The 
depth of their ideas was limited by their abilities to imagine such a system at work. 

Although they identified other important aspects in their use of the bathroom, 
namely sharing, routines and differences in use between weekdays and weekends,
participants did not proactively initiate remarks on how these aspects could help make 
an ambient lighting system suit their needs in a better way. 

3.2 What People Said on an Ambient Lighting System After Witnessing the 
Demonstrator 

Participants were invited to HomeLab to experience the lighting system demonstrator
installed in the bathroom. By witnessing the demonstrator, participants were able to
shift from imagination to actually experiencing what coloured lighting, changes in
intensity and transitions meant. All participants expressed their desire to have such
a lighting system in their bathroom, which is a remarkable change from the opinion
they expressed earlier during the Cultural Probes.

The overall impression about the different Modes presented during the demo was 
very positive. All participants found the concepts behind the six Modes (Night, 
Wake-up, Day, Relax, Artistic and Beauty Light) attractive. From the very beginning 
of the demonstration, after all the modes were presented, all participants were enthu- 
siastic about the new modes. Participants generated more ideas on what they would 
want from such a system. 

The feasibility study not only helped them see what coloured lighting could do in 
the bathroom. It also triggered their imaginations allowing us to gather requirements 
on lighting and the modes themselves as well as on other specific aspects such as 
colour, indirect lighting, seasonal effect, duration of transitions, triggering and con- 
trolling modes, and colour temperature of white light. 

One aspect where participants had been unable to provide input was the question 
on how such a system should react to situations where two or more users share the
bathroom simultaneously. After the demo, all families proactively engaged in a short
friendly discussion on how the system should react to sharing situations, without
having the facilitator explicitly ask participants about it. The general conclusion from 
these discussions is that the problem should be solved by the couple according to 
common social rules and not by the system. Participants said the way the problem 
would be solved would depend on the circumstances of the encounter (activity, mood, 
day, duration, level of disruption depending on how much the overall lighting would 
actually be affected). 



4 Discussion 

There are several issues that should be taken into account when considering the outcomes
of the current study. First of all, there is the question of which factors influenced
the strong shift from expressing no need for coloured lighting and dynamic transitions
in the first phase of the study to being so unanimously enthusiastic
about the demonstrator showing these features. The main reason for such a strong 
difference in opinion may be the fact that in the first case it was very hard for partici- 
pants to imagine what coloured lighting could look like or could do for them in the 
context of the bathroom. In that sense, the feasibility study conducted by Philips 
Lighting which resulted in a demonstrator for ambient lighting in the bathroom 
proved to be crucial for breaking the barrier between imagination and direct experi- 
ence. 

The experience we had in the current study with the advanced lighting system may also
hold for other ambient intelligent systems. Since people are not familiar with the
possibilities these systems have to offer, it would be difficult to collect valid
requirements if people had to imagine those systems. Their opinion can change
completely once they have experienced such systems, similar to what happened in our
study.

The different nature of diaries and workshops may also have had an effect on the
outcome of this study. The diaries may have felt like a cumbersome and boring
school-like activity in which participants had to do their homework for a period of one
week. This could have had a negative effect on the breadth of their answers. On the
other hand, the workshops were mainly based on participants verbalizing what they were
seeing. Their ability to express what they like and dislike may have had an influence
on the final output. The limited time, as well as the social interaction during the
workshops, may have prevented participants from getting a full picture of the system.

There was a very clear positive effect on the participants’ attitude after seeing the
demonstration, and it remained constant throughout the workshop. The overall attitude
from the participants towards the entire activity may be influenced by the nature and 
duration of the task. Here again, this could be a matter of discussion since cultural 
probes are a one-week task where people keep track of their daily activities in the 
bathroom compared to a three-hour workshop where people are stimulated by seeing 
something novel and that will only require them to concentrate for a short period of 
time.



5 Conclusions

When gathering requirements for new Ambient Intelligent products, allowing partici- 
pants to go from imagination to experience by inviting them to witness and evaluate 
advancements in technology at an early stage of the development of the product 
makes a big difference. We believe the findings of our study confirmed the impor- 
tance of conducting feasibility studies to help bridge the technology push-demand 
pull gap. By inviting users to give their views at this early stage of the design process,
users can have a say in how these systems would work, be controlled and be tailored
to their needs.



Abstract. Research on developing context-aware applications faces the challenge
of collecting, integrating and combining information in a wide range of
formats from a variety of sources, such as desktop applications, web services,
user profiles and databases. On the other hand, the need for pervasive real-time
application integration has boosted the development of Integration Platforms in
enterprise environments. These platforms provide an infrastructure of services
that enable dozens of applications spread around the organization to recognize 
and respond to the presence of information. In this paper we propose that inte- 
gration frameworks can be used to facilitate the development of context-aware 
applications. As a proof of concept, we developed a simple context-aware ap- 
plication that works on top of a well-known integration framework. 



1 Introduction 

Research on ubiquitous computing points to a future in which computers, applications 
and networks will be part of the infrastructure and, just like the electric and telephone 
networks these days, will be almost invisible [17]. In this environment, Context- 
Aware Applications (CAA) will use information from sources spread all over the 
place, such as sensors, databases and Web Services providing real-time data. All 
these systems will cooperate with each other on top of an ad-hoc ubiquitous
infrastructure.

However, it has been extremely difficult to find a generic but working solution to the
problem of collecting all this information in such an open and dynamic environment. 
Two important issues in the CAA research area are the central role of mobile applica- 
tions (including the limited resources of mobile devices) and the need for real-time 
data sharing in order to support these mobile applications. 

On the other hand, the need to seamlessly exchange information amongst an increasing
number of applications is an old problem in large service organizations, such as
banks, insurance companies and mobile telecom operators. This need gave rise to the
development of so-called “Integration Frameworks” [9] (also known as integration
servers or message brokers) that provide an infrastructure to enable dozens of
applications to exchange information. Furthermore, this exchange takes place in real-time
and automatically, without the need for human interference. Examples of integration 
frameworks include well-known products from WebMethods [15], Microsoft (Biz- 
talk) [10] and IBM (Websphere Integration Server) [16]. 

In this paper we propose to use an integration framework as the basic infrastructure 
for developing CAA. This architecture proposal was validated by developing a mo- 
bile CAA prototype that works on top of a well known integration framework. 

The paper is organized as follows. Section 2 presents the current challenges in the 
CAA research area, Section 3 highlights some related work (in particular, infrastruc- 
ture proposals) and Section 4 introduces integration frameworks. Our proposal is then 
presented in Section 5. Section 6 describes the prototype we developed to validate the 
proposal while in Section 7 we evaluate the prototype. Finally, Section 8 summarizes 
the main contributions of the paper and presents our planned future work. 



2 Challenges in Context-Awareness Research

The overall objective of context-awareness research is to build and operate context-
enabled environments in which multiple sources and consumers (i.e. applications) of
context collaborate in exchanging information (about context) using an infrastructure. 

In such an open and dynamic environment, a source may provide information to be 
used by none, one or multiple interested applications. At the same time, these appli- 
cations must be able to obtain information from a variety of sources, eventually 
choosing the best fit (cheapest, fastest, etc.) among them. As a result, a number of 
context-aware applications (CAA) and prototypes have been developed in the past 
few years and some examples will be presented in Section 3. 

However, most of these CAA were built for specific purposes and did not pay much
attention to generic issues, such as:

• Mobile applications will take a lead role in context-aware environments -
although mobile applications are nowadays easy to build, mobile devices
are limited in resources and (usually) disconnected from the Internet. These
limitations suggest the development of simple mobile applications that will
use real-time data and powerful functionalities provided by the
infrastructure;

• There will be many sources of context information - although databases
and Web Services can be used to access remote sources providing context
information, there will be so many sources that CAA should instead rely on
a single generic infrastructure that will provide all context information
instantaneously in a standard protocol and format.

Context-aware applications (CAA) need information about context in a variety of
formats retrieved from a large number of sources, including machines, databases, 
applications, Web Services, LDAP profiles, and so on. Application integration (as the 
basis for sharing information) is already difficult and Section 4.1 will explain the 
main challenges in this area. However, because CAA are small mobile applications 
that rely on a large number of autonomous sources, application integration is particu- 
larly difficult in the CAA research area. 

These difficulties suggest an architecture in which CAA will rely on services
provided by a common context-aware infrastructure that is, by itself, the enabler of
the context-aware environment.



3 Related Work 

Research in context-awareness tries to produce a working implementation of a generic
infrastructure to access all sources of context information. In this Section we review 
some proposed architectures that are useful mainly because they enumerate desirable 
components for a context-aware infrastructure. 



3.1 Four-Layered Architecture 

Schmidt proposed a four-layered architecture [12] (see Figure 1) for the acquisition of 
context information.

Fig. 1. A four-layered architecture for context sensing (taken from [12])

Although this proposal is based on tightly coupled applications embedded into mobile 
devices, the layered architecture suggests a generic approach in which a network of 
sensors provides information to middleware context layers that combine or adjust the 
information needed for the mobile applications.



3.2 Context Toolkit

Dey proposed the Context Toolkit [4] as a tool to facilitate the quick development of 
CAA. The toolkit is defined by three abstractions that form a coherent set of compo- 
nents for developing context-aware applications: 

• Context widgets - are responsible for acquiring a specific type of context
and making it available to CAA in a generic manner, no matter how it was
actually acquired;

• Context interpreters - compose context based on other context (for exam- 
ple, to convert a name to an e-mail address); 

• Context aggregators - collect all context information about an entity (for 
example, a person) and behave as proxies for applications. 

These abstractions are very interesting because they can be used for integrating in- 
formation about context that might come from a number of sources and help stan- 
dardize distinct information in order to be available in a generic format. 
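
The three abstractions can be illustrated with a minimal Python sketch. This is not the Context Toolkit's actual API; the class names mirror the abstractions described above, while the method names (`current`, `interpret`, `snapshot`) and the sample data are assumptions made purely for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ContextWidget:
    """Acquires one type of context and hides how it was obtained."""
    context_type: str
    read: Callable[[], str]  # hypothetical acquisition function (sensor, file, ...)

    def current(self) -> Dict[str, str]:
        return {self.context_type: self.read()}


@dataclass
class ContextInterpreter:
    """Derives new context from existing context (e.g. name -> e-mail)."""
    source_type: str
    target_type: str
    convert: Callable[[str], str]

    def interpret(self, context: Dict[str, str]) -> Dict[str, str]:
        if self.source_type not in context:
            return {}
        return {self.target_type: self.convert(context[self.source_type])}


@dataclass
class ContextAggregator:
    """Collects all context about one entity and acts as a proxy for applications."""
    entity: str
    widgets: List[ContextWidget] = field(default_factory=list)
    interpreters: List[ContextInterpreter] = field(default_factory=list)

    def snapshot(self) -> Dict[str, str]:
        context: Dict[str, str] = {}
        for widget in self.widgets:
            context.update(widget.current())
        for interpreter in self.interpreters:
            context.update(interpreter.interpret(context))
        return context


# Illustrative data only: two widgets and one interpreter for one person.
name_widget = ContextWidget("name", lambda: "alice")
room_widget = ContextWidget("room", lambda: "lab-2")
to_email = ContextInterpreter("name", "email", lambda n: f"{n}@example.org")
alice = ContextAggregator("alice", [name_widget, room_widget], [to_email])
```

An application would query only the aggregator (`alice.snapshot()`), without knowing which widget or interpreter produced each value.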



3.3 Infrastructures 

In order to support the development of CAA, some researchers have proposed so-
called “services infrastructures” [6, 8] that provide context to this kind of
application. This approach places the burden of most of the common tasks performed in a
context-aware environment (context services discovery, context sensing and processing,
event notification, and so on) on a common shared infrastructure.

These researchers argue that an infrastructure approach has several advantages [8]: 

• Independence from hardware, operating system and platform - achieved 
by using standard data formats and network protocols; 

• Improved capabilities for maintenance and evolution - because new ele- 
ments can be added or replaced transparently, without affecting applications 
or other components; 

• Sharing of sensors, processing power, data and services - so that many 
CAA can use the same infrastructure and even take advantage of existing in- 
frastructures. 



3.4 Broker 

Chen proposes that context-aware applications should be integrated with specialized 
components (called brokers) that are spread throughout the environment. These
components will also need to work together in order to provide context information
extracted from their domains.

Chen introduced the notion of “context domains” [2] by proposing an architecture in
which brokers act on behalf of applications. These brokers (also called agents) are 
specialized in some context domain and provide context about that domain. 



4 Integration Frameworks 

The difficulty of integrating dozens of applications is an old problem in large organi- 
zations so, as a consequence, a number of solutions have been proposed [9,14]. Re- 
cently, most IT vendors have converged on so-called Integration Frameworks that can 
be considered a de facto standard architecture. Examples of products that follow this 
architecture are Websphere Integration Server from IBM and Biztalk from Microsoft. 
WebMethods is a vendor specialized in this area and their products were used in the 
prototype described in Section 6. 

In a typical scenario - a large service organization with dozens of applications and
databases - an integration framework provides the backbone in which applications 
can publish information they produce and subscribe to the information they consume. 
Using this asynchronous publish-subscribe model, organizations can create a “global 
knowledge” of every important event that occurred in all applications, databases or 
other IT components. 
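
The publish-subscribe model can be sketched in a few lines of Python. This is a toy in-memory broker written for illustration, not the interface of any commercial integration framework; the class `MessageBroker`, the topic names and the message fields are all assumptions, and delivery is synchronous here only to keep the sketch short.

```python
from collections import defaultdict
from typing import Any, Callable, DefaultDict, Dict, List

Message = Dict[str, Any]
Handler = Callable[[Message], None]


class MessageBroker:
    """Toy in-memory broker: routes each published message to the
    subscribers of its topic. A real integration framework would queue
    messages and deliver them asynchronously."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: Message) -> int:
        """Deliver the message and return the number of subscribers reached."""
        handlers = list(self._subscribers.get(topic, []))
        for handler in handlers:
            handler(message)
        return len(handlers)


broker = MessageBroker()
billing_inbox: List[Message] = []
provisioning_inbox: List[Message] = []

# Two applications subscribe to the same (hypothetical) event type.
broker.subscribe("customer.new", billing_inbox.append)
broker.subscribe("customer.new", provisioning_inbox.append)

delivered = broker.publish("customer.new", {"id": 17, "plan": "basic"})
ignored = broker.publish("invoice.paid", {"id": 17})  # nobody subscribed
```

The publisher never names its consumers, so a further application can subscribe to the same topic later without any change to the publishing side.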

The most important services provided by most integration frameworks are:

• Support for most network protocols - from basic TCP/IP, HTTP and 
SMTP to proprietary MSMQ and MQSeries, including JMS and other stan- 
dard high-level protocols such as ebXML and RosettaNet; 

• Easy data transformation - data are provided in format Fa by application
A but can be easily converted to format Fb for application B and to format
Fc for application C;

• Rules - can be applied after receiving data from an application A before 
sending that data to application B or C; for example, conditional transforma- 
tions might be performed and some data can be deleted for security reasons; 

• Publish and subscribe - applications in general can publish many different 
types of information but other applications can subscribe only to the types of 
information they need; 

• Notifications - are sent to applications if some event occurs.
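
As a rough illustration of the data transformation and rules services listed above, the following Python sketch converts a record from one format to another and then applies a rule that deletes a sensitive field before delivery. All function names, field names and sample values are hypothetical; real integration frameworks typically configure such mappings through their own tools rather than hand-written code.

```python
from typing import Any, Callable, Dict, Optional

Record = Dict[str, Any]


def transform_a_to_b(record: Record) -> Record:
    """Convert application A's format (Fa) to application B's format (Fb).
    Field names are hypothetical."""
    return {
        "full_name": f"{record['first']} {record['last']}",
        "msisdn": record["phone"],
        "card_number": record.get("card_number"),
    }


def strip_sensitive(record: Record) -> Record:
    """Rule: delete data that must not be forwarded for security reasons."""
    return {k: v for k, v in record.items() if k != "card_number"}


def route(record: Record,
          transform: Callable[[Record], Record],
          rule: Optional[Callable[[Record], Record]] = None) -> Record:
    """Apply the transformation, then the optional rule, before delivery."""
    out = transform(record)
    return rule(out) if rule is not None else out


source = {"first": "Ana", "last": "Silva", "phone": "+351000000000",
          "card_number": "4111-0000"}
for_b = route(source, transform_a_to_b, strip_sensitive)
```

Keeping the transformation and the rule as separate steps means the same rule can be reused for application C with a different transformation.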

A typical integration framework is based on the following core components (see 
Figure 2 below): 

• Message brokers - the actual backbone of the architecture responsible for 
efficiently routing messages from publishers to subscribers; 

• Adapters - components responsible for exchanging information between 
message brokers and resources (databases, applications, Web Services, and 
so on) and vice-versa; 

• Application server - supports the execution of business logic, data
transformation and complex business rules required for specific purposes.




Fig. 2. Architecture of an Integration Framework 



Most integration frameworks have adapters for integration technologies such as 
COM, ODBC, JDBC, JMS and Web Services as well as popular enterprise applica- 
tions such as SAP, Siebel and PeopleSoft. The application server works as a regular 
subscriber and publisher of information. Other components are available for config- 
uring and managing the framework as well as designing, executing and monitoring 
human workflows and business processes. 



5 Proposal 

Most of the proposals presented above agree that CAA will rely on distributed components
that need to be integrated in a services infrastructure. As can be seen, the problem is
closely related to the application integration problem described in Section 4, and
the two may have a similar solution. In fact, some proposals - such as the Context
Toolkit [4] - are very similar to the architecture of a typical integration framework
already used for connecting dozens of applications in many large companies.

We propose to use an integration framework for developing CAA. Most integration 
frameworks already provide the generic advantages (detailed in Section 3.3) required 
from any infrastructure approach for developing CAA. At the same time, an integra- 
tion framework already offers a working solution for publishing, subscribing and 
routing information about context generated by dozens of applications. Finally, it 
should be easy to send information from an integration framework to a mobile
application.

As presented before, using an integration framework any application can provide
context to any other application just by publishing and subscribing to information. An
adapter can be thought of as an implementation of Dey’s “context widget” [4] because it
is a component belonging to the framework that is responsible for integrating with a 
resource. All applications that subscribe to some type of information are notified 
when new information of that type is published. 

Adapters can be developed to retrieve context from any kind of resource (sensors, 
machines, applications, databases, etc.) and publish this information to a message 
broker. The message broker will then route this information to any CAA that sub- 
scribed to this type of context. Finally, the application server can act as a “context 
interpreter” or “aggregator” collecting, combining and transforming information 
about context. In summary, integration frameworks can be used for supporting the 
development of CAA without any major change. 
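
The mapping between framework components and context abstractions might look as follows in Python. This is a sketch under stated assumptions, not the prototype's code: `Broker`, `Adapter` and `TemperatureInterpreter` are hypothetical classes, and the temperature resource, topic names and 25-degree threshold are invented for the example.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, List

Message = Dict[str, Any]


class Broker:
    """Minimal stand-in for the framework's message broker."""

    def __init__(self) -> None:
        self._subs: Dict[str, List[Callable[[Message], None]]] = defaultdict(list)

    def subscribe(self, context_type: str, handler: Callable[[Message], None]) -> None:
        self._subs[context_type].append(handler)

    def publish(self, context_type: str, message: Message) -> None:
        for handler in list(self._subs[context_type]):
            handler(message)


class Adapter:
    """Plays the role of a context widget: wraps one resource and
    publishes whatever it reads as a given type of context."""

    def __init__(self, broker: Broker, context_type: str,
                 read_resource: Callable[[], Message]) -> None:
        self._broker = broker
        self._context_type = context_type
        self._read = read_resource

    def poll(self) -> None:
        self._broker.publish(self._context_type, self._read())


class TemperatureInterpreter:
    """Plays the application-server role: subscribes to raw context,
    transforms it, and republishes the derived context."""

    def __init__(self, broker: Broker) -> None:
        self._broker = broker
        broker.subscribe("temperature.raw", self._on_raw)

    def _on_raw(self, message: Message) -> None:
        label = "warm" if message["celsius"] >= 25 else "cool"
        self._broker.publish("temperature.label",
                             {"room": message["room"], "label": label})


broker = Broker()
interpreter = TemperatureInterpreter(broker)
received: List[Message] = []
broker.subscribe("temperature.label", received.append)  # the CAA subscribes

sensor = Adapter(broker, "temperature.raw",
                 lambda: {"room": "lab", "celsius": 27.5})
sensor.poll()
```

The CAA sees only the derived context type; neither the sensor nor the interpretation step is visible to it.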

There are some other reasons in support of integration frameworks: 

• Most large service organizations already have an integration framework; 

• An integration platform can be used for integrating and developing many 
kinds of applications, not only CAA; 

• Many kinds of context that CAA will subscribe to are straightforward
business information that may be already available;

• The other services offered by most integration platforms (data transforma- 
tion, rules etc.) can also be used to facilitate the task of developing CAA. 



6 Prototype 

In order to validate the proposal we have implemented a prototype CAA using a 
commercial integration framework called WebMethods. We chose as case study an 
example mobile application provided by a mobile telecom operator. 



6.1 Case Study 

Integration platforms are very popular in the telecom industry because in these or- 
ganizations there are dozens of applications that need to be integrated in order to 
support most business processes. 

For example, a new customer created in the CRM application triggers the event “new
customer”, which is likely to be subscribed to by the network provisioning application (in
order to install the services requested by the customer) and the billing package (in
order to start charging the customer for the services). Once the “new customer” event
is triggered, any (existing or future) application can subscribe to that event and start
obtaining information about new customers.

On the other hand, the network infrastructure provides an interesting mixture of context
information, including the user address, real-time location, all past phone
calls, and so on. This context can be used for a variety of purposes, from focused 
marketing to developing CAA. 



6.2 Push-To-Talk Prototype 

The Push-To-Talk (PTT) concept enables mobile phones to connect directly to each
other, in effect using mobile phones as walkie-talkies. Although the PTT concept can
be implemented using many different technologies, for business reasons the mobile
phones (two or more) should be nearby, typically connected to the same antenna.

That means a user can only contact through PTT some of the contacts in the phone
book, and those contacts change dynamically over time depending on which contacts
are nearby. So it is useful to know whether a contact is nearby before attempting to
use PTT with that contact. Even better, it would be nice to have a list of “nearby
contacts” that the user can call without initiating a regular phone call. (We are
assuming PTT is faster and cheaper than regular calls.)

The PTT prototype that was implemented is basically a contact manager that allows a 
user to insert and remove contacts. The location and availability contexts are updated 
automatically based on information sent by the telecom operator. This is an excellent 
example of a CAA because it includes mobility, user interface and context informa- 
tion that changes dynamically depending on user preferences. 

The application must have access to two types of context: contact location and contact
availability. These two types of context come from different sources. Location
comes from existing applications in the telecom operator that know the location of
mobile users, while availability information (a user is connected and agrees to be
called) comes from the PTT application on each user’s mobile phone.



6.3 Prototype Architecture 

The prototype architecture is based on the integration platform that already exists in 
the telecom operator that provides the case study. We opted for WebMethods, which is
actually used by that telecom operator but, since we could not test the prototype in
the real working environment, we used the same product in the laboratory.

The architecture for developing the Push-To-Talk prototype (depicted in Figure 3) 
has the following components: 

• Buddies client application - the contact manager application that runs on 
the mobile phones and supports the desired PTT functionality; 

• Buddies server application - the application that stores subscriber
information, contact lists, availability and location of all users;

• Web services - a set of Web Services provided by the telecom operator to
provide information about users; 

• Movement simulator - simulates movements across coverage areas and 
availability status changes; 

• Integration framework - the set of components (see Section 4) that provide 
the services infrastructure for exchanging all messages between applications. 




Although there is no limitation for the variety of context information to be published 
to the integration infrastructure, our prototype deals only with two types of context: 

• Subscriber location - provided by the telecom network services whenever 
the user changes antenna; 

• Subscriber availability - provided by each subscriber whenever they 
change status. 

It is important to note that these contexts are not provided directly or exclusively to 
the PTT prototype; they are provided to the integration framework that in turn sends
the context to all interested applications. In a real telecom operator this context would 
be available for other applications, or maybe it would be already available, so build- 
ing this CAA would be extremely easy. 

The Buddies Client Application runs on subscribers’ mobile phones and lets a user
maintain a list of contacts, adding or removing contacts (name plus number). It is
context-aware in the sense that the contact list is sensitive to context provided by the
telecom’s services infrastructure: the GUI shows distinctly colored icons that indicate
each contact’s availability status. Users can also change their own availability status
by choosing the “I’m Busy” option in the menu. In this case, a notification of the
status change is sent to the integration framework, which routes that notification to
all interested applications.

The Buddies Server Application updates information for this subscriber and all other 
subscribers who have this number in their contact list. This application is also respon- 
sible for maintaining context about all subscribers and thus provides all the informa- 
tion needed for the Buddies Client Applications. 
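
The routing described above can be illustrated with a minimal publish/subscribe sketch. It is not the WebMethods API: the names ContextBroker and BuddiesServer, the message fields, and the phone numbers are all hypothetical and chosen only to show how one published status change reaches every interested application.

```python
from collections import defaultdict
from typing import Callable, Dict, List

# Hypothetical names throughout: ContextBroker stands in for the integration
# framework, BuddiesServer for the Buddies Server Application.
Handler = Callable[[dict], None]


class ContextBroker:
    """Routes each published context message to every subscriber of its type."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, context_type: str, handler: Handler) -> None:
        self._subscribers[context_type].append(handler)

    def publish(self, context_type: str, message: dict) -> int:
        """Deliver the message and return how many handlers received it."""
        handlers = list(self._subscribers[context_type])
        for handler in handlers:
            handler(message)
        return len(handlers)


class BuddiesServer:
    """Keeps the latest availability and location for each subscriber."""

    def __init__(self, broker: ContextBroker) -> None:
        self.state: Dict[str, dict] = defaultdict(dict)
        broker.subscribe("availability", self._on_availability)
        broker.subscribe("location", self._on_location)

    def _on_availability(self, msg: dict) -> None:
        self.state[msg["subscriber"]]["available"] = msg["available"]

    def _on_location(self, msg: dict) -> None:
        self.state[msg["subscriber"]]["cell"] = msg["cell"]


broker = ContextBroker()
server = BuddiesServer(broker)
audit_log: List[dict] = []  # a second, unrelated consumer of the same context
broker.subscribe("availability", audit_log.append)

broker.publish("availability", {"subscriber": "555-0101", "available": False})
broker.publish("location", {"subscriber": "555-0101", "cell": "antenna-17"})
```

The publisher never names its receivers, which is why a second application (here the audit log) can consume the same context without any change to the client.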

In order to evaluate the prototype we needed a tool to simulate changes of location
and status. This tool, the movement simulator, was essential for observing the
prototype's real-time recognition of and reaction to context changes. Since the
tool is not central to the paper, it is not described here.



7 Evaluation 

In our experiment, using an Integration Framework provided not only the features
proposed in related research work - such as a layered approach, context acquisition,
transformation and aggregation - but also an entirely new set of desirable infrastructure
services, such as publishing, subscribing and notification services. These services
directly support the creation of environmental awareness by allowing applications
to publish information to a common shared infrastructure.



7.1 Advantages 

The experiment also demonstrated that our proposal has many other advantages: 

• Reduced development time - we were able to design and build a complete 
CAA in a very short period of time using the components provided by the 
integration framework; 

• Creation of a context-aware infrastructure - a set of context information,
such as cell change and status change, is made available in the infrastructure
to any other application that can use such information. This creates
enormous synergies with other applications and drastically reduces the cost
of developing new CAAs;

• Leveraging existing investments - this development shows that integration
frameworks that are commercially available and already in use in many
business environments are context-aware infrastructures and already
include a large set of components for developing CAAs;

• Enabling light solutions - placing the burden of performing specific tasks
on the infrastructure (such as collecting context information, interfacing
with a number of applications, transforming information, reasoning about
context, among others) frees the CAA from dealing with all these tasks,
making it much more suitable for the mobile devices that are meant to run
it.



7.2 Limitations

The prototype was implemented on top of the .Net Compact Framework with the SmartPhone
Emulator. This development framework is a limited implementation of the full
.Net Framework, owing to storage limitations on mobile phones and other restrictions. For
example, a few workarounds were needed to handle XML documents because of
the lack of XML Serialization classes. In the future, we expect that developing
applications for mobile phones and PDAs will be much easier.

In addition, the Buddies Client Application should be notified immediately when any 
change of context (availability or location) occurs to any subscriber in its contact list. 
Because of existing limitations in the .Net Compact Framework, the application has to 
periodically ask the Buddies Server Application if there are any changes. We hope 
this limitation will be solved in the next version of the .Net Compact framework. 
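
The polling workaround can be sketched as follows. This is an illustration only, not the prototype's code: ChangeLog, PollingClient, and the version counter are assumptions introduced here to show how a client can ask for "changes since my last poll" when the server cannot push notifications.

```python
from typing import Dict, List, Tuple

Change = Tuple[str, str, object]  # (subscriber, field, new value)


class ChangeLog:
    """Server side: ordered list of context changes (hypothetical structure)."""

    def __init__(self) -> None:
        self._changes: List[Change] = []

    def record(self, subscriber: str, field: str, value: object) -> None:
        self._changes.append((subscriber, field, value))

    def changes_since(self, version: int) -> Tuple[int, List[Change]]:
        """Return the current version and all changes recorded after `version`."""
        return len(self._changes), self._changes[version:]


class PollingClient:
    """Client side: asks for changes periodically instead of being notified."""

    def __init__(self, log: ChangeLog, contacts: List[str]) -> None:
        self._log = log
        self._version = 0
        self.contacts: Dict[str, dict] = {number: {} for number in contacts}

    def poll_once(self) -> int:
        """Fetch new changes and apply those that concern our contact list."""
        self._version, changes = self._log.changes_since(self._version)
        applied = 0
        for subscriber, field, value in changes:
            if subscriber in self.contacts:
                self.contacts[subscriber][field] = value
                applied += 1
        return applied


log = ChangeLog()
client = PollingClient(log, contacts=["555-0101"])
log.record("555-0101", "available", False)
log.record("555-0199", "available", True)  # not in the contact list
first = client.poll_once()   # picks up the one relevant change
second = client.poll_once()  # nothing new since the last poll
```

In a real client, poll_once would run on a timer; the polling interval then bounds how stale the displayed availability can be.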

None of these limitations is related to the Integration Framework, and they will
probably be resolved by Microsoft in the near future.



8 Conclusion 

Challenges for building an infrastructure for context-aware applications and enabling 
the ubiquitous context environment can be described in terms of desired features [6]:

1. Dynamic discovery of software components and information;

2. Dynamic interconnection of components; 

3. Sensing, interpretation and dissemination of context; 

4. Mobility and adaptation of components; 

5. Rapid development and deployment of large numbers of software compo- 
nents and user interfaces. 

To this list we would like to add a sixth feature, “context type schema services”,
which provide dynamic recognition of the context services that can supply the context
established in a schema definition.

Based on our experiment we can conclude that, in fact, an integration framework 
(such as WebMethods) can enormously facilitate the development of context-aware 
applications such as the prototype we built. These frameworks can support some of 
the features listed above, such as those found in 3, 4 and 5. However, since these 
frameworks were originally designed to work in a completely controlled environ- 
ment, features 1 and 2 are usually left out. 

In the future we would like to continue our research towards using an integration 
framework for supporting all six features that, in our opinion, are necessary to build 
these ubiquitous applications. This research will always be based on prototypes that 
demonstrate the feasibility of our proposals.



Abstract. In this article we identify the general communication patterns of physical
devices and define all interfaces and conflict resolution strategies that are
present in any ensemble. Based on this approach we define a generic component
topology applicable to Ambient Intelligence scenarios. We validate our interface
framework by applying it to two smart scenarios, the smart conference room and
the smart living room, and by showing that applications and devices can be moved
from one ensemble to the other. Our approach guarantees the full
extensibility of device ensembles with new devices.



1 Introduction 

The vision of Ambient Intelligence [1,3,16] is based on the ubiquity of information 
technology, the presence of computation, communication, and sensorial capabilities in 
an unlimited abundance of everyday appliances and environments. A rather popular 
scenario illustrating this vision is the smart conference room (or smart living room, for 
consumer-oriented projects, see Figure 1) that automatically adapts to the activities of its 
current occupants (cf. e.g. [5,12,17,20]). Such a room might, for instance, automatically 
switch the projector to the current lecturer’s presentation as she approaches the speaker’s 
desk¹, and subdue the room lights, turning them up again for the discussion. Of course,
we expect the environment to automatically fetch the presentation from the lecturer’s 
notebook. 

Such a scenario doesn’t sound too difficult: it can readily be constructed from
common hardware available today and, using pressure sensors and RFID tagging, doesn’t
even require expensive cameras and difficult image analysis to detect who is currently
at the speaker’s desk. Setting up the application software for this scenario, which drives
the environment’s devices in response to sensor signals, doesn’t present a major hurdle
either. So it seems as if Ambient Intelligence is rather well understood, as far as
information technology is concerned. Details like image and speech recognition, as well as
natural dialogues, of course need further research, but building smart environments from
components is technologically straightforward, once we understand what kind of proactivity
users will expect and accept.

But only as long as the device ensembles that make up the environment are anticipated
by the developers. Today’s smart environments in the various research labs are usually
built from devices and components whose functionality is known to the developer. So,
all possible interactions between devices can be considered in advance and suitable
adaptation strategies for coping with changing ensembles can be defined. When looking
at the underlying software infrastructure, we see that the interaction between the different
devices, the “intelligence”, has been carefully handcrafted by the software engineers
who built this scenario. This means that significant (i.e. unforeseen) changes of the
ensemble require a manual modification of the smart environment’s control application.

* This work has been partially supported by the German Federal Ministry of Education and
Research under the grant signature BMB-F No. FKZ 01 ISC 27 A.

¹ For the smart living room this reads: “switch the TV set to the user’s favorite show, as he takes
a seat on the sofa.”

P. Markopoulos et al. (Eds.): EUSAI 2004, LNCS 3295, pp. 112-123, 2004.

Fig. 1. Left: A typical conference room with a multiplicity of devices, e.g. projectors, microphones,
speakers or different lights. Right: A modern living room with typical entertainment devices, e.g.
TV set or surround sound devices.

This is obviously out of the question for real world applications, where people
continuously buy new devices to embellish their homes. And it is a severe cost factor for
institutional operators of professional media infrastructures such as conference rooms
and smart offices. Things can be even more challenging: imagine a typical ad hoc
meeting, where some people meet in a perfectly average room. All attendees bring notebook
computers, at least one brings a beamer, and the room has some light controls. Of
course, all devices will be accessible by wireless networks. So it would be possible for
this chance ensemble to provide the same assistance as the deliberate smart conference
room above. Enabling this kind of Ambient Intelligence, the ability of devices to
configure themselves into a coherently acting ensemble, requires more than setting up a
control application in advance. Here, we need software infrastructures that allow a true
self-organization of ad-hoc appliance ensembles, with the ability to accommodate non-trivial
changes to the ensemble. (See also [19] for a similar viewpoint on this topic.)

Besides providing the middleware facilities for service discovery and communica- 
tion, such a software infrastructure also has to identify the set of fundamental interfaces 
that characterize the standard event processing topology to be followed in all possible en- 
sembles. This standard topology is the foundation for an appliance to be able to smoothly 
integrate itself into different ensembles: In a conference room, the user’s notebook may 
automatically connect to a beamer and deliver the user’s presentation, while it will hook 
up to the hi-fi system and deliver an MP3 playlist when arriving back home. 

In this paper, we propose such a framework of standard interfaces, which
emerged during the projects Embassi [10] and DynAMITE [4]. While we have dealt
with the underlying middleware for supporting such interface frameworks in previous
publications [7,8,9], we focus here on the specific interfaces we have identified as
mandatory for appliance ensembles, and on their applicability across different application
scenarios. The structure of this paper is as follows: In Section 2 we give an overview of
our software infrastructure that supports self-organizing ensembles. Section 3 describes
the basic data flow principles of application ensembles and introduces a generic appliance
topology that uses different conflict resolution strategies. The strategies are explained
in Section 3.2. The realization of the scenarios is illustrated in Section 4. Based on this,
we compare our approach with other activities in Section 5 and outline our next steps in
Section 6.

2 A Self-Organizing Middleware 

In this section, we briefly introduce the salient properties of the middleware we have 
developed for supporting self-organizing appliance ensembles. For more detailed
descriptions, see [7,8,9].

2.1 Appliances and Event Processing Pipelines 

When developing a middleware concept, it is important to look at the communication
patterns of the objects that are to be supported by this middleware. For smart environ- 
ments, we need to look at physical devices, which have at least one connection to the 
physical environment they are placed in: they observe user input, or they are able to 
change the environment (e.g. by increasing the light level, by rendering a medium, etc.), 
or both. When looking at the event processing in such devices, we may observe a spe- 
cific event processing pipeline, as outlined in Figure 2: Devices have a User Interface 
component that translates physical user interactions to events, the Control Application 
is responsible for determining the appropriate action to be performed in response to 
this event, and finally the Actuators are physically executing these actions. It seems rea- 
sonable to assume that all devices employ a similar event processing pipeline (even if 
certain stages are implemented trivially, being just a wire connecting the switch to the 
light bulb). 

It would then be interesting to extend the interfaces between the individual process- 
ing stages across multiple devices, as outlined in the right side of Figure 2. This would 
allow a dialogue component of one device to see the input events of other devices, or 
it would enable a particularly clever control application to drive the actuators provided 
by other devices. By turning the private interfaces between the processing stages in a 
device into public channels, we observe that the event processing pipeline is
implemented cooperatively by the device ensemble on a per-stage level. Each pipeline stage
is realized through the cooperation of the respective local functionalities contributed by 
the members of the current ensemble. 

So, the underlying approach of our proposal for a middleware is to develop a
concept that provides the essential communication patterns of such data-flow based
multi-component architectures. Note that the channels outlined in Figure 2 are not the complete
story. Much more elaborate data processing pipelines can easily be developed (such as
outlined in [10]). Therefore, the point of our middleware concept is not to fix a specific
data flow topology, but rather to allow arbitrary such topologies to be created ad hoc
from the components provided by the devices in an ensemble. The model we have
developed so far is called SodaPop (for: Self-Organizing Data-flow Architectures supporting
Ontology-based problem decomposition).

Fig. 2. Devices and Data Flows

2.2 Core Concepts of SodaPop 

SodaPop provides the capability to create channels — message busses — on demand. On 
a given SodaPop channel, messages are delivered between communication partners 
based on a refined publish / subscribe concept. Every channel may be equipped with an 
individual strategy for resolving conflicts that may arise between subscribers competing 
for the same message (the same request). 

Once a component requests a channel for communication, a check is performed 
to see whether this channel already exists in the ensemble. If this is the case, the new 
component is attached to this channel. Otherwise, a new channel is created. Through this 
mechanism of dynamically creating and binding to channels, event processing pipelines 
emerge automatically, as soon as suitable components meet. 
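
The create-or-attach behaviour described above can be sketched in a few lines. This is a simplified, single-process illustration: the Ensemble registry and the component names are hypothetical, and the real SodaPop middleware performs this check across distributed devices.

```python
from typing import Dict, List


class Channel:
    def __init__(self, name: str) -> None:
        self.name = name
        self.components: List[str] = []


class Ensemble:
    """Hypothetical local registry standing in for distributed channel lookup."""

    def __init__(self) -> None:
        self._channels: Dict[str, Channel] = {}

    def request_channel(self, name: str, component: str) -> Channel:
        channel = self._channels.get(name)
        if channel is None:
            # The first request for a channel creates it.
            channel = Channel(name)
            self._channels[name] = channel
        # Every request, first or later, attaches the requesting component.
        channel.components.append(component)
        return channel


ensemble = Ensemble()
first = ensemble.request_channel("goals", "speech-parser")
second = ensemble.request_channel("goals", "lecture-assistant")
```

Because both components end up on the same channel object, a pipeline stage forms as soon as a producer and a consumer of the same channel meet.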

When subscribing to a channel, an event consumer declares: 

- the set of messages it is able to process, 

- how well it is suited for processing a certain message, 

- whether it is able to run in parallel to other message consumers on the same message, 

- whether it is able to cooperate with other consumers in processing the message.

These aspects are described by the subscribing consumer’s utility. A utility is a
function that maps a message to a utility value, which encodes the subscriber’s handling
capabilities for the specific message.
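
A utility of this kind might look as follows. The sketch assumes a concrete shape for the utility value (a quality score plus two flags) and hypothetical message fields such as "op" and "medium"; the source does not specify the actual encoding.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class UtilityValue:
    quality: float      # how well the consumer handles this message (0..1)
    parallel: bool      # can run alongside other consumers of the message
    cooperative: bool   # can share the work with other consumers


# A utility maps a message to a UtilityValue, or None if it cannot handle it.
Utility = Callable[[dict], Optional[UtilityValue]]


def projector_utility(message: dict) -> Optional[UtilityValue]:
    if message.get("op") != "render":
        return None
    quality = 0.9 if message.get("medium") == "slides" else 0.4
    return UtilityValue(quality=quality, parallel=False, cooperative=False)


def lamp_utility(message: dict) -> Optional[UtilityValue]:
    if message.get("op") != "set_light":
        return None
    return UtilityValue(quality=1.0, parallel=True, cooperative=True)


slides = projector_utility({"op": "render", "medium": "slides"})
ignored = lamp_utility({"op": "render", "medium": "slides"})
```

Returning None for unsupported messages covers the first declared aspect (the set of messages a consumer can process), while the three fields cover the remaining ones.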

When a channel processes a message, it evaluates the subscribing consumers’
handling capabilities and then decides which consumers will effectively receive the
message. Also, the channel may decide to decompose the message into multiple (presumably
simpler) messages, which can be handled better by the subscribing consumers.
(Obviously, the consumers then solve the original message in cooperation.) The basic process
of message handling is shown in Figure 3.

Fig. 3. Basic message handling process in SodaPop

How a channel determines the effective message decomposition and how it chooses 
the set of receiving consumers is defined by the channel’s decomposition strategy. 

Both the transducers’ utility and the channel’s strategy are eventually based on the 
channel’s ontology - the semantics of the messages that are communicated across the 
channel. 
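
To make the role of a channel strategy concrete, the sketch below gives a channel a pluggable strategy function. Both strategies shown (best_consumer and split_by_part), the numeric utilities, and the message layout are assumptions for illustration; they are not the strategies shipped with SodaPop.

```python
from typing import Callable, Dict, List, Optional, Tuple

Utility = Callable[[dict], Optional[float]]   # None means "cannot handle"
Delivery = Tuple[str, dict]                   # (consumer name, message)
Strategy = Callable[[dict, Dict[str, Utility]], List[Delivery]]


def best_consumer(message: dict, subscribers: Dict[str, Utility]) -> List[Delivery]:
    """Deliver the whole message to the single highest-rated consumer."""
    rated = [(utility(message), name) for name, utility in subscribers.items()]
    rated = [(score, name) for score, name in rated if score is not None]
    if not rated:
        return []
    _, winner = max(rated)
    return [(winner, message)]


def split_by_part(message: dict, subscribers: Dict[str, Utility]) -> List[Delivery]:
    """Decompose a multi-part message and place each part separately."""
    deliveries: List[Delivery] = []
    for part in message["parts"]:
        deliveries.extend(best_consumer(part, subscribers))
    return deliveries


class Channel:
    def __init__(self, strategy: Strategy) -> None:
        self.strategy = strategy
        self.subscribers: Dict[str, Utility] = {}

    def subscribe(self, name: str, utility: Utility) -> None:
        self.subscribers[name] = utility

    def post(self, message: dict) -> List[Delivery]:
        return self.strategy(message, self.subscribers)


channel = Channel(split_by_part)
channel.subscribe("projector", lambda m: 0.9 if m.get("op") == "show" else None)
channel.subscribe("tv", lambda m: 0.5 if m.get("op") == "show" else None)
channel.subscribe("lamp", lambda m: 1.0 if m.get("op") == "dim" else None)

plan = channel.post({"parts": [{"op": "show"}, {"op": "dim"}]})
```

Swapping the strategy changes the coordination behaviour of the channel without touching any subscriber, which is the separation the text describes.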

To summarize, self-organization is achieved by two means in SodaPop: 

1. Identifying the set of channels that completely cover the essential message
processing behavior for any appliance in the prospective application domain.

2. Developing suitable channel strategies that effectively provide a distributed coordi- 
nation mechanism tailored to the functionality, which is anticipated for the listening 
components. 

Then any device is able to integrate itself autonomously into an ensemble, and any set 
of devices can spontaneously form an ensemble. 

In the following, we describe how the SodaPop infrastructure is used to set up a
generic topology for Ambient Intelligence.

3 A Generic Topology for Ambient Intelligence

3.1 The Set of Channels 

In order to develop a generic topology for Ambient Intelligence that allows the
communication and self-organization of physical devices, we have to analyse the different
communication patterns of the devices while interacting as an ensemble. Our example
scenarios are the smart conference room and the smart living room as outlined in
Section 1. Clearly, both kinds of environment have to react to events that are ultimately
caused by the user’s behavior. There may be explicit user interactions, as well as implicit
interactions, such as the user standing up and walking to the speaker stand. So, we can
identify a physical level of user interface components and sensor components that map
user interactions and environment changes, respectively, into atomic events (see the first
level of components in Figure 4). These atomic events (or sequences of them) have to be
translated into goals, which the ensemble should try to achieve in response to the observed
behavior. Pressing a light switch should be interpreted as the wish of the user to change
the current state of the associated lights. Likewise, the sequence of events (1) person
rising from a chair and (2) arriving at the speaker’s desk should be interpreted as the
wish to give a lecture (one example of the many (sequences of) events that should
trigger “intelligent” behaviour of a smart room). These interpretations are made by a
level of parser components that translate events (or sequences of events) into certain
(user) goals. Every physical device inevitably contains some kind of parser component.
Even today’s TV sets receive the infrared signals of a remote control, interpret the
commands that are encoded in the infrared light (that is, interpret the user’s interaction
goal) and execute them (e.g. switch on / off). As our example scenarios illustrate, a goal
can be more abstract than a simple command that switches a state, e.g.
the preparation of a conference room for a lecture. Likewise, the user’s goal to hear music
(the interpreted goal, after she double-clicked an mp3-file) does not specify the physical
devices that now have to be manipulated. Thus we define a third level of components,
which is responsible for mapping goals into concrete function calls. These assistant
components map abstract goals into concrete function calls that are executed by the
subsequent actor components. Some parser or assistant components could be very simple,
e.g. for a light, where they are basically a simple wire. But in general they may be rather
complex, as the smart conference room example demonstrates.

By turning the private interfaces between the processing stages in a device into public
channels (see Section 2), it is possible to achieve an architectural integration. Figure 4
illustrates the proposed generic topology for ambient intelligence, dividing physical
devices into four component levels that are consecutively connected by three channels:

- the event-channel that allocates the events to the different parser components 

- the goals-channel that passes the constructed goal from the parsing level to the most 
appropriate assistant component 

- the operations-channel that passes concrete function calls to the actors.

Fig. 4. Generic Topology for Ambient Intelligence. The number of different components on each
level is unlimited.
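
The four levels and three channels can be pictured as a small pipeline. In this sketch the channels are reduced to plain function calls, and all event, goal, and operation names are hypothetical; it only shows how data changes form from events to goals to function calls.

```python
from typing import List

# Hypothetical event, goal, and operation names chosen for illustration.


def parser(events: List[str]) -> List[str]:
    """Parser level: turn a sequence of atomic events into goals."""
    if events == ["chair_released", "desk_pressure"]:
        return ["prepare_lecture"]
    if events == ["switch_pressed"]:
        return ["toggle_lights"]
    return []


def assistant(goal: str) -> List[str]:
    """Assistant level: turn an abstract goal into concrete function calls."""
    plans = {
        "prepare_lecture": ["projector.on", "room_lights.dim", "microphone.on"],
        "toggle_lights": ["room_lights.toggle"],
    }
    return plans.get(goal, [])


class Actors:
    """Actor level: records the operations it was asked to execute."""

    def __init__(self) -> None:
        self.executed: List[str] = []

    def execute(self, operation: str) -> None:
        self.executed.append(operation)


def run_pipeline(events: List[str], actors: Actors) -> None:
    # Events reach the parser via the event-channel.
    for goal in parser(events):
        # Goals reach the assistant via the goals-channel.
        for operation in assistant(goal):
            # Function calls reach the actors via the operations-channel.
            actors.execute(operation)


actors = Actors()
run_pipeline(["chair_released", "desk_pressure"], actors)
```

In the actual topology each arrow is a channel with its own conflict resolution strategy, so several parsers, assistants, and actors can compete at every stage.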



3.2 The Channel-Strategies 

In order to guarantee dynamic workflow and extensibility of the device ensemble, we 
assigned conflict resolution strategies to each channel. The strategy requirements of the 
different channels are the following: 

- the event-channel has to resolve between competing interpretations of (sequences 
of) sensor readings and interaction events: event interpretation strategy 

- the goals-channel has to deliver tasks to the most appropriate assistant component:
opinion based agent selection algorithm

- and the operations-channel has to find an appropriate actor, or a combination of
actors, for a specific function call: first match strategy (like Prolog, it chooses the
first component that matches the task conditions).
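
As a minimal sketch of the first match strategy, the function below walks the registered actors in order and returns the first one whose condition accepts the call. The actor names, the predicate form, and the call fields are assumptions made for this example.

```python
from typing import Callable, List, Optional, Tuple

# Each actor is registered as (name, predicate); registration order matters.
Actor = Tuple[str, Callable[[dict], bool]]


def first_match(call: dict, actors: List[Actor]) -> Optional[str]:
    """Return the first registered actor whose condition accepts the call."""
    for name, accepts in actors:
        if accepts(call):
            return name
    return None


actors: List[Actor] = [
    ("beamer", lambda c: c["function"] == "show" and c["medium"] == "slides"),
    ("tv", lambda c: c["function"] == "show"),
    ("hifi", lambda c: c["function"] == "play_audio"),
]

slides_target = first_match({"function": "show", "medium": "slides"}, actors)
video_target = first_match({"function": "show", "medium": "video"}, actors)
```

As with clause order in Prolog, the result depends on registration order: the more specific beamer entry must precede the general tv entry to be chosen for slides.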

The event interpretation strategy acts upon the following principles: 

- parser components that might interpret an event (or sequences of events) can lock 
those events by giving an interpretation function to the channel 

- the event interpretation strategy analyses the different interpretation functions: 

• in case the interpretation functions are compatible, the channel allows both 
parsers to interpret and to publish goals 

• in case the interpretation functions (and thus the interpreted goals) are not
compatible, the parser with the longer parse wins, because it provides a more detailed
view on its environment (and takes more environment variables into account
than the others).

To give an example: Assume a parser a that interprets the event sequence (r; s) as the
goal switch off lights and a parser b that interprets the event r as the goal switch on
lights. According to the principles given above, parser a provides the better parse,
provided the event sequence is completed. Otherwise parser b is allowed to continue. But
if b interprets event r as turn all devices off, the event interpretation strategy allows
both parsers to construct their goals, because the interpretations are compatible.
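
The example with parsers a and b can be replayed in code. The sketch makes two assumptions that the text leaves open: incompatibility is given as an explicit set of goal pairs, and the strategy is evaluated once on a finished event sequence rather than incrementally with locking.

```python
from typing import FrozenSet, List, Set, Tuple

Parser = Tuple[str, Tuple[str, ...], str]  # (name, event pattern, goal)

# Assumption: incompatibility is given as an explicit set of goal pairs.
INCOMPATIBLE: Set[FrozenSet[str]] = {
    frozenset({"switch_off_lights", "switch_on_lights"}),
}


def compatible(goal_a: str, goal_b: str) -> bool:
    return frozenset({goal_a, goal_b}) not in INCOMPATIBLE


def interpret(events: Tuple[str, ...], parsers: List[Parser]) -> List[str]:
    """Return the goals published for a finished event sequence."""
    # Only parsers whose pattern is a prefix of the observed events qualify.
    matching = [p for p in parsers if events[: len(p[1])] == p[1]]
    # Longer parses are considered first, so they win conflicts.
    matching.sort(key=lambda p: len(p[1]), reverse=True)
    accepted: List[Parser] = []
    for candidate in matching:
        if all(compatible(candidate[2], kept[2]) for kept in accepted):
            accepted.append(candidate)
    return [goal for _, _, goal in accepted]


a = ("a", ("r", "s"), "switch_off_lights")
b_conflicting = ("b", ("r",), "switch_on_lights")
b_compatible = ("b", ("r",), "turn_all_devices_off")

completed = interpret(("r", "s"), [a, b_conflicting])  # a wins the conflict
only_r = interpret(("r",), [a, b_conflicting])         # a never completes
both = interpret(("r", "s"), [a, b_compatible])        # no conflict
```

The three calls correspond to the three cases in the example: the longer parse wins, the shorter parse continues when the sequence is not completed, and compatible interpretations are both published.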

The opinion based agent selection algorithm is used by the goals-channel. Once
a parser component emits a goal into the goals-channel, the most appropriate assistant
has to be found. But how can a channel provide this functionality in a distributed and
dynamically changing ensemble? We chose an approach that uses the assistants’ opinions
(the assistants are the available domain experts). In other words,
we calculate the objective ability of each assistant from the assistants’ subjective
opinions. Every assistant that takes part in a request to tender for accomplishing a
goal provides the channel with the aspects it considers relevant to solving it. For
each aspect, the assistant provides the channel with the following values:

- the relative importance of each aspect 

- a confidence value for each aspect describing the confidence of the component that 
the aspect indeed has the assigned importance 

- a fidelity value that describes, how well the component thinks it can consider this 
aspect or adjust it to the ideal value 

These values are used to calculate the effective objective importance of each raised aspect.
Multiplying these by the individual fidelity values yields an estimate of the objective
performance (please note that the underlying mathematical algorithms are outside the scope
of this article). To continue our example, suppose the goal is to prepare the lecture room
for a speech. Imagine two assistant components raising the aspects they could take into
account to develop an effective strategy for reaching this goal. The first assistant raises
the aspect presentation, with high importance and fidelity values, and the aspect lightings,
with low importance and fidelity values. The second assistant raises the
aspect presentation, with almost the same importance and fidelity values as the first
one, but additionally the aspects lightings and sound reproduction, with high values of
importance and fidelity. According to our algorithm, and according to rationality, the
second assistant would win.
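
Since the article does not give the mathematics, the following sketch substitutes one plausible formula, which is an assumption and not the authors' algorithm: the objective importance of an aspect is the confidence-weighted mean of the importances stated for it, and an assistant's score is the sum of objective importance times its own fidelity over the aspects it raised. All numeric values are invented to mirror the lecture room example.

```python
from collections import defaultdict
from typing import Dict, Tuple

# opinion[aspect] = (importance, confidence, fidelity), each in 0..1
Opinion = Dict[str, Tuple[float, float, float]]


def objective_importance(opinions: Dict[str, Opinion]) -> Dict[str, float]:
    """Assumed formula: confidence-weighted mean importance per aspect."""
    weighted: Dict[str, float] = defaultdict(float)
    weights: Dict[str, float] = defaultdict(float)
    for opinion in opinions.values():
        for aspect, (importance, confidence, _) in opinion.items():
            weighted[aspect] += importance * confidence
            weights[aspect] += confidence
    return {a: weighted[a] / weights[a] for a in weighted if weights[a] > 0}


def select_assistant(opinions: Dict[str, Opinion]) -> Tuple[str, Dict[str, float]]:
    """Assumed score: sum of objective importance times own fidelity."""
    importance = objective_importance(opinions)
    scores = {
        name: sum(importance.get(aspect, 0.0) * fidelity
                  for aspect, (_, _, fidelity) in opinion.items())
        for name, opinion in opinions.items()
    }
    winner = max(scores, key=lambda name: scores[name])
    return winner, scores


opinions: Dict[str, Opinion] = {
    "first": {
        "presentation": (0.9, 0.9, 0.9),
        "lightings": (0.2, 0.5, 0.2),
    },
    "second": {
        "presentation": (0.9, 0.9, 0.85),
        "lightings": (0.8, 0.9, 0.9),
        "sound reproduction": (0.8, 0.9, 0.9),
    },
}

winner, scores = select_assistant(opinions)
```

Under these assumed numbers the second assistant scores higher because it covers more aspects that the ensemble as a whole rates as important, which matches the outcome stated in the text.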

4 Realisation 

To investigate the viability of the interface framework, we need to show that we can
build different smart environments on this framework and that we can then freely move
appliances between these environments. We built a smart conference room, consisting
of different lights, seats with pressure sensors, a pressure sensor at the speaker’s desk,
a microphone and a projector (with presenter PC). Figure 5 illustrates the component
topology. We implemented a parser component that translates sequences of events (e.g.
locking a notebook², rising from a chair and arriving at the speaker’s desk) into the goal
to prepare the room for a presentation from the PC with the IP number x and publishes
it on the goals-channel (see (1) and (2) in Figure 5). Given this goal, the assistant (see
(3) in Figure 5) constructs the following action sequence: switch on the beamer, get
presentation from PC with IP number x, switch speaker stand lights on, darken room
lights, switch on microphone and loudspeakers (see (4) in Figure 5). After moving the
notebook from the smart conference room to the smart living room (see the component
topology in Figure 6) we chose an mp3-file for playing. The notebook emits the event
to play, containing the name of the file and the IP number (see (1) in Figure 6). This
event is translated into the corresponding goal to play the mp3-file. The assistant ((3) in
Figure 6) creates the action sequence to play the file from the notebook by using the hi-fi
system.

² Here, we implemented a simple awareness software that emits an event containing the IP
number of the notebook if the user presses a special shortcut.

Fig. 5. The appliance ensemble of our smart conference room consisting of pressure sensors,
projector, a personal notebook, different room lights and a microphone.

5 Related Work 

There are other approaches that address the problem of dynamic, self organizing
systems, such as for example HAVi [6], Jini [13], the Galaxy Communicator Architecture
[14,15], or SRI’s Open Agent Architecture (OAA) [11]. Galaxy and OAA in particular
provide architectures for multi-agent systems. Also, the pattern-matching approach in
SodaPop is not new. Comparable concepts can be found in Galaxy, in the OAA, as well
as in earlier works on Prolog or in the Pattern-Matching Lambda Calculus. Here the 
SodaPop approach provides a certain refinement at the conceptual level by replacing 
language-specific syntactic pattern-matching functionality (cp. the Prolog-based pattern 
matching of OAA) by a language-independent facility based on utility value computation 
functions that are provided by transducers.

Fig. 6. The communication flow in our smart living room realisation. The notebook components
are highlighted dark-gray to point out the dynamic extension of the device ensemble.

SodaPop differs from the other approaches by using a two-stage approach to system decomposition and self
organization. Coarse-grained structuring is provided by defining channels, fine grained 
structure is supported by pattern-matching. This approach makes the application of dif- 
ferent strategies working on different stages of the workflow possible. Two strategies, 
one for event interpretation and one for choosing the most appropriate component for 
a given task, were developed and applied to different channels. They are applicable to 
any application domain in contrast to the decomposition and recombination strategies of
Galaxy and OAA. Galaxy provides a centralized hub-component, which uses routing 
rules for modeling how messages are transferred between the different system com- 
ponents, whereas OAA provides Prolog-based strategy mechanisms. Both approaches 
require a re-design of their rule bases in case they are extended by other components. 
Another disadvantage of Galaxy and OAA is the use of heavyweight routing
components that incorporate arbitrary memory. Consequently they are not suited for a distributed
implementation. Other research initiatives address the intelligent control of users’
environments. The Easy Living project [2,20] provides access to information and services
(e.g. a personal presentation). For this purpose, Easy Living uses a centralized
architecture with two main components. A room server holds information about all devices
and services, whereas a rule engine uses sensor information to control the room devices.
This approach bears resemblance to the Galaxy architecture by using a centralized rule 
engine. But the split-up of room information and rules into two components allows more 
flexibility. Nevertheless, the world model as well as the rules have to be extended (or 
exchanged) in case the device infrastructure changes. The Intelligent Classroom from
Northwestern University [5,18] uses a declarative approach to make a classroom more
intelligent. For this purpose, two rule systems are used. The first rule system translates user
gestures and user utterances into user goals, whereas the second rule system infers the
environment’s possible reactions. This is similar to our approach of a principal
component topology as described in Section 3, where we distinguish between a parser level that
constructs goals from sequences of events and an assistant level that infers operation
calls for the actors. But instead of the rule systems approach of the Intelligent Classroom
we chose a distributed solution, where the channels act like moderators and are able to
choose the most appropriate interpretation that is offered by available transducers.

In our experience, it is disadvantageous to provide only a single granularity for
decomposing a complex system structure. The single granularity necessarily has to be fine
in order to provide the required flexibility. When trying to fix the overall structure of the
system, such a fine granularity provides too much detail and quickly leads to a
proliferation of interfaces that are shared by only a few components. This proliferation of
interfaces is counterproductive, because it obstructs the interoperability of system
components, a prime goal of the work presented here.



6 Conclusions and Future Work 



In this paper we defined a generic topology for Ambient Intelligence based on the
SoDAPop middleware concepts. The generic topology makes it possible to develop device
ensembles that behave intelligently and reasonably. Public rooms in particular, such as
smart conference or lecture rooms, should behave as most users would expect (e.g.
standing at the speaker’s desk means that someone wants to give a presentation). Home
entertainment devices should likewise behave logically, that is, in the manner most users
would expect; in all other cases direct interaction is always possible. We therefore
identified the general communication patterns of physical devices and defined all
communication channels and conflict resolution / decomposition strategies that are
present in any device ensemble. Any device offering components that can integrate
themselves into the generic topology presented here is able to cooperate with other
devices within spontaneous, self-organizing device ensembles. We demonstrated the
applicability of our generic topology by implementing two scenarios in which we can
freely move applications. The SodaPop infrastructure is currently implemented in Java.
This implementation supports different channel strategies (e.g. the event interpretation
strategy and the opinion-based selection algorithm). It is applied within the project
DynAMITE [4] and is available from the project web site. The downloadable software
offers an API for implementing one’s own channels and components as well as for applying
the different channel strategies to them.

But further work has to be done. Our current work includes the implementation of
graphical interfaces that allow experiments to be set up quickly and efficiently without
the need to install real devices (Ambient Intelligence fast prototyping). Quality of
Service (QoS) guarantees also have to be defined. Currently we have no mechanism for
making global statements about the set of components and channels. Such statements
could contain constraints on the topology of the channel / component network as well as
constraints on its temporal behaviour.
Abstract. Emerging technologies in building automation have the potential to
increase the quality and cost effectiveness of services in the building industry. 
However, insufficient range of collected data and models of the physical and be- 
havioural aspects of the facilities limit the capabilities of building automation 
systems. We describe a project for improving building services by collecting com- 
prehensive data from variable sources and generating high-resolution models of 
buildings. In this context, location sensing is critical not only for data collection, 
but also for constructing models of buildings as dynamic environments. We first 
examine a range of existing location sensing technologies from the building au- 
tomation perspective. We then outline the implementation of a specific location 
sensing system together with respective test results. 



1 Introduction 

Building automation is expected to improve building performance by reducing the op- 
eration and maintenance costs of buildings (e.g. for heating, cooling, and lighting), im- 
proving environmental performance, augmenting human comfort, and providing higher 
safety levels. However, data collection and monitoring activities in current building au- 
tomation systems are rather limited: the focus is mostly on service systems such as 
elevators and office equipment. There is a lack of systematic and scalable approaches 
to comprehensive facility state monitoring throughout buildings’ life cycle. To achieve 
a higher level of building automation technology, collected data must cover not only the 
state of systems such as elevators, but also the state of room enclosure surfaces, furniture, 
doors, operable windows, and other static or dynamically changing building entities.
Toward this end, we focus on generating comprehensive and self-updating models of the
physical and behavioural aspects of facilities over their life cycle [1,2,3,4,5]. In this
context, we are developing and implementing a prototype sensor-supported self-updating
building model for simulation-based building operation support [4].

To deliver a proof of concept for the feasibility of the system, we focus on lighting 
controls in a test space. The control scenario is as follows: at regular time intervals, 
an Executive Control Unit (ECU) considers possible changes in the states of control 
devices (e.g. the dimming positions of the electrical light fixtures, the position of window
shades). The ECU then requests a lighting simulation program to predict the implications 
of device states for the lighting performance of the space (i.e., light availability and 
distribution) in the immediate future time interval. Based on the comparison of the 
predicted performance levels with desired (user-based) objective functions, the ECU 
initiates the transition to the most desirable control state. For this scenario to work, the 
underlying model generation system must consider a wide range of space and system 
characteristics, including the spatial and material properties of a space as well as the 
state of the luminaires and furniture. Specifically, the lighting simulator requires an
accurate and up-to-date model of both internal space and external conditions (i.e. the 
sky luminance pattern) to run the necessary simulations. This implies the need for a 
location sensing system to provide real-time identification and location data for the 
construction of a space model. This information is subsequently used by the ECU to 
construct a 3D object model in the system database. The resulting model can be used 
for lighting simulations. Similar models can be constructed to inform other applications 
for building operation and facility management support. 
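
The control scenario described above can be sketched as a simple selection loop. The
following Python fragment is only an illustration under stated assumptions: the device
state tuple, the toy simulation and the toy objective function are hypothetical
stand-ins, not the lighting simulator or objective functions used in the project.

```python
from typing import Callable, Dict, Iterable, Tuple

# Hypothetical device state: (dimmer level in %, shade opening in %).
DeviceState = Tuple[int, int]


def choose_control_state(
    candidates: Iterable[DeviceState],
    simulate: Callable[[DeviceState], Dict[str, float]],
    objective: Callable[[Dict[str, float]], float],
) -> DeviceState:
    """Return the candidate whose predicted performance scores highest."""
    return max(candidates, key=lambda state: objective(simulate(state)))


def toy_simulation(state: DeviceState) -> Dict[str, float]:
    """Made-up model: daylight scales with shade opening, lamps with dimmer level."""
    dimmer, shade = state
    return {"illuminance_lux": 4.0 * dimmer + 3.0 * shade, "power_w": 2.0 * dimmer}


def toy_objective(performance: Dict[str, float]) -> float:
    """Prefer 500 lux, with a small penalty for electric power (assumed targets)."""
    return -abs(performance["illuminance_lux"] - 500.0) - 0.1 * performance["power_w"]


# One control step: enumerate candidate states, predict, compare, pick the best.
candidates = [(d, s) for d in (0, 50, 100) for s in (0, 50, 100)]
best = choose_control_state(candidates, toy_simulation, toy_objective)
```

In a real deployment the `simulate` callable would wrap the lighting simulation program
and would need the up-to-date space model discussed in the text.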

The challenge in constructing a model is that the building infrastructure is not a 
static entity and may change in multiple ways during its life cycle. In office buildings, 
an indicator for these dynamics is churn, that is, the number of office moves during 
a given year. Depending on the flexibility of a building’s systems, churn can involve 
significant infrastructure changes. According to a recent study on churn, freestanding
furniture changes daily to monthly, while modular partitions change once a year [10]. The ability
to track such changes automatically is necessary for the viability of simulation-based 
building control. In our prototype, this task is performed by a location sensing system. 

2 Location Sensing Technology Review 

2.1 Assessment Criteria 

Prior to the implementation of a location sensing system, the available technologies are
examined from the building automation perspective. A suitable system must be capable of
identifying individual objects and returning their locations. Furthermore, it should
require minimum maintenance and be scalable so that it can adapt to changes in a
facility. In addition to these basic requirements, the most important evaluation criteria
are accuracy (relatively small systematic variation in measurements), unobtrusiveness
(minimal installation and maintenance necessary, no inconvenience or health hazards for
occupants), cost (per square meter, per object), scalability (dozens to hundreds of items
per room, thousands per building), reliability (accurate location information for long
time intervals, under adverse conditions, in cluttered, changing indoor environments),
identification capability (identification of individual objects, not just types of
objects), and temporal resolution or update rate (how fast changes in an object’s
location can be detected and reported).

Most currently available location systems use tags, small items affixed to the actual
objects to be tracked. Location information is obtained by signal exchange between these
tags and a sensor infrastructure (sensors, readers). Even more so than in other ubicomp
applications, building model applications call for rather small, long-lived tags that
require no batteries or any other maintenance. Moreover, systems based on devices that
obtain or calculate position information internally (called localized location
computation) are not meaningful in building model applications, unless the location
information is fed back to the overall system.

2.2 Available Technologies 

Electromagnetic and Radio Frequency. These include technologies based on the 
measurement of electromagnetic or radio frequency signals’ field strengths, distortion, 
time-of-flight or frequency. Their main advantage is that they can usually operate through
obstacles, without requiring a line of sight between tags and sensors. However, the 
presence of metal objects and thick walls can have significant influence on operating 
range and location accuracy. Electromagnetic systems (such as Polhemus FASTRAK 
[6]) achieve very high accuracy and precision (mm range), but can only operate in 
relatively small, closed environments. They are also very expensive and sensitive to 
metallic objects, and often require cable connections between tags and sensors. 

A number of research prototypes and products are available for using existing RF
infrastructure (such as Bluetooth or 802.11 networks) to calculate position information,
for instance Ekahau Positioning Engine [7]. All these products are based on localized
location computation, making them currently less suitable for building model
applications. Systems based on RFID (radio frequency identification) tags are
particularly interesting, but currently no mature commercial system with acceptable
accuracy is available. SpotON [8], a research project, claims accuracy in the one meter
range using off-the-shelf active tags. However, the available data can only support an
accuracy of three meters. LANDMARC [9], a similar research system, aims to improve
accuracy by installing a grid of reference tags throughout the area of interest. The
location accuracy of the system is approximately the same as the granularity of the grid,
which means that to achieve one meter accuracy, active (battery-powered) reference tags
have to be placed in a grid with a unit length of one meter.
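
To make the infrastructure cost of such a reference grid concrete, the following Python
sketch counts the grid nodes needed for a rectangular floor. The room dimensions are
hypothetical, and placing a tag on every grid node including the room edges is an
assumption of this sketch rather than a property reported for LANDMARC.

```python
import math


def reference_tags_needed(width_m: float, length_m: float, spacing_m: float) -> int:
    """Count grid nodes covering a rectangular floor, including both edges."""
    if spacing_m <= 0:
        raise ValueError("spacing must be positive")
    columns = math.ceil(width_m / spacing_m) + 1
    rows = math.ceil(length_m / spacing_m) + 1
    return columns * rows


# Hypothetical 4 m x 5 m office.
office = reference_tags_needed(4.0, 5.0, 1.0)  # 1 m grid: 5 x 6 nodes
coarse = reference_tags_needed(4.0, 5.0, 3.0)  # 3 m grid: 3 x 3 nodes
```

Under these assumptions a single small office already needs thirty battery-powered
reference tags for one meter accuracy, which illustrates the maintenance burden noted
above.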

The commercial product PinPoint [11] uses active tags, communicating with trans- 
ponders in the microwave frequency range. In indoor environments, the system requires 
considerable installation overhead. The system can achieve a resolution of 3 meters at
best; for finer resolutions it is limited to reporting only the presence of a tag.
A competing product, WhereNet [12], achieves similar performance. 

Ultrasound. Ultrasound-based systems typically consist of battery-powered tags or 
“badges” and a set of transponder stations communicating with them; position infor- 
mation is obtained by measuring time-of-flight of acoustic signals. Research prototypes 
include Bat [13] and Cricket [14]. A commercial product is available from Sonitor Tech- 
nologies [15]. Sonitor’s system can operate in two modes: room-based (containment) 
and 3D. In 3D mode, it requires eight receiver devices to be fixed in every room; for 
positioning, four of these must be in direct line of sight to the tag. A maximum of four 
tags can be tracked per room, with a claimed resolution of 2-3 centimeters. Although 
this resolution is sufficient for building model purposes, the poor scalability and the 
strict line-of-sight restriction make this technology impractical for use in real-world 
applications. 
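
The time-of-flight principle behind these systems can be illustrated with a short Python
sketch. The speed of sound used here (343 m/s, dry air at roughly 20 °C) is an assumed
constant; real systems would have to account for temperature and for the processing
delays of their hardware, which this sketch ignores.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # assumed: dry air at about 20 degrees Celsius


def distance_from_time_of_flight(seconds: float) -> float:
    """One-way distance travelled by an acoustic pulse."""
    if seconds < 0:
        raise ValueError("time of flight cannot be negative")
    return SPEED_OF_SOUND_M_PER_S * seconds


def timing_accuracy_for(resolution_m: float) -> float:
    """Timing accuracy (in seconds) needed to resolve a given distance."""
    return resolution_m / SPEED_OF_SOUND_M_PER_S


d = distance_from_time_of_flight(0.010)  # a 10 ms flight
dt = timing_accuracy_for(0.02)           # timing needed for 2 cm resolution
```

With these numbers a centimeter-range resolution corresponds to timing differences of
a few tens of microseconds, which is readily measurable for sound but not for radio
signals over the same distances.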
Optical / Vision-Based. Sensors stimulated by optical attributes are also used for loca-
tion awareness. The main advantage of vision-based location awareness systems is that 
they do not require high-cost tags that need continuous maintenance; sometimes they 
do not require a tag at all. The non-tagged technologies utilize visual attributes of the 
objects and are based on computer vision methods that exploit the relationship of the 
brightness at a point (x,y) in the image with the depth (z) information of the surface 
under certain lighting or camera conditions. A prototypical realisation of the methods 
mentioned above is EasyLiving [16], where the location of the occupants inside a room
is detected together with their personal identification. EasyLiving successfully combines 
multi-sensor data. However, the system is limited to person location and identification. 
Moreover, the identification is based on color features only, which makes it difficult to 
distinguish people wearing clothes with similar colors. 

Another approach is the use of laser sensors to determine depth information. These 
systems comprise a transmitter unit where a laser beam is generated and emitted, and a 
receiver unit where the reflecting laser beam is captured. Depth information is extracted 
from the travel time of the laser beam. A more complex version of these sensors is 
the laser camera where the above process takes place for each picture element and 
consequently, forms a range map of the scene. CityScanner [17] performs a combination 
of this technology with digital cameras. This system is, however, designed primarily for 
outdoor use and generally too slow for building model applications. Constellation 3Di 
[18] is another product in the category of laser-based systems. It uses active tags that 
emit laser beams. It is a very accurate (sub-millimeter) laser system; however, it is based
on localized location computation and is therefore not considered further in this paper.

The optical systems mentioned above are in contrast with the common “low-cost,
low-maintenance” feature of vision-based location awareness systems because they require
active tags. Other vision-based systems use passive tags and utilize their visual
attributes rather than the visual attributes of the objects themselves. These
technologies usually work in variations of the well-known “barcode reader” principle,
scanning scenes for distinctive optical markers. Like other optical active-tagged or
non-tagged systems, they have the disadvantages of requiring line-of-sight between
objects and sensors (cameras, scanners) and of raising privacy concerns for occupants.
However, when compared with non-tagged solutions, the main benefit of using tags is that
they can be coded with an ID number that makes the identification of the individual
objects possible.

One example is Phoenix Technologies’ Visualeyez system that uses LED markers 
fixed to the tracked object. It achieves sub-millimeter accuracy and is able to track 
thousands of tags simultaneously. Its main disadvantages are high cost and the power 
consumption of the LED tags, limiting its usefulness for realistic office situations. There 
are also systems using simpler visual tags rather than power consuming LEDs. Shape 
features are used for identifying most of these tags where the area and number of holes, 
lines and plain regions in the tag determine its main shape features. A Mitsubishi Electric 
Research Laboratories research prototype [19] focuses on identification of trademark 
logos by using the shape definitions. In addition to shape-based methods, the contours 
of the visual tag are extracted and length and curvature features are also processed for 
identification. 
Table 1. Qualitative overview of location-sensing systems

                  RFID      ultrasound   vision (tagged)
precision         3 m       few cm       few cm
obtrusiveness     medium    medium       high
cost              high      high         low
scalability       high      low          high
reliability       high      low          low
identification    yes       yes          yes


TRIP (target recognition using image processing, [21]) is a particular example of 
the contour-based method. It works with circular black-and-white TRIPcode tags that 
can be generated with an ordinary laser printer. The contours of the tag are taken from 
camera images and then used to calculate position and identification with a number of 
image processing techniques. 



2.3 Summary 

Table 1 provides a qualitative comparison of the main location-sensing technologies 
considered above. In short, the findings are: 

- RF-based tagged systems — while promising due to their high scalability and relia- 
bility — are not accurate enough and require considerable infrastructure, as well as 
fairly expensive tags. 

- Ultrasound systems provide sufficient accuracy, but have serious shortcomings in 
scalability and reliability, as only a few tags per room can be tracked, and accurate 
positioning requires clear line of sight between tags and receivers. Cost, both in 
terms of infrastructure and tags, is comparable to RFID solutions. 

- Tagged vision-based systems require relatively simple and cheap infrastructure
(digital cameras) and very cheap tags (paper printouts). For real-world applications,
full camera supervision of office spaces raises privacy concerns. However, because their
sensors are discernible, vision-based technologies generate less anxiety among privacy
advocates than RF-based systems, whose nature is stealthy. Vision-based systems require
a clear line-of-sight, which reduces their reliability. For experimental applications,
though, such systems provide a useful solution and possess potential for adaptation to
the real world.

It can be concluded that there is no perfect location system for self-updating building
models today. Vision-based methods appear to be the most appropriate solutions for our
requirements, because they are software-supported and open to modification and
improvement. The latest developments in distributed programming, software agents and
high-power processors also make vision-based solutions more promising. Based on this
technical review, we have adopted such a system, as described in the following sections.
3 Location Sensing System for Self-Updating Building Models

3.1 System Framework 

Our system is designed as a vision-based location sensing system that uses a combination of
visual markers (tags) and video cameras. Among the reviewed vision-based methods, 
the algorithm proposed in TRIP [20,21] offers a suitable solution for location sensing 
in building environments. The TRIP algorithm uses optimized image processing and 
computer vision methods, and benefits from low-cost, black-and-white tags. It obtains 
in real-time the identification and location (both position and orientation data) of an 
object to which the visual tag is attached. 

Our assessment criteria, given in 2.1, emphasize that the location system should also
provide fine-grained spatial information at a high update rate, be unobtrusive, and be
scalable in terms of sensing the locations of many objects in a wide area. To meet these
requirements, our Location Sensing System (LSS) is designed as a distributed structure,
with the software components tied together by the Internet. Communication and data
sharing are governed by the Distributed Component Object Model (DCOM) protocol, which
enables these software components to communicate over a network [22]. The distributed
structure of the LSS enables scalability and incremental growth, in addition to enhanced
performance derived from parallel operation.

Figure 1 shows the framework of the LSS. It comprises four main software components:
Application Server, Database Server, User Interface Server, and Target Recognition and
Image Processing (TRIP) Clients. The Application Server is the central unit that controls
the distributed components of the system, and performs resource sharing and load
balancing among the clients. Resources are the available computing and sensor devices in
the system. Network cameras (NetCam) serve as the sensors; each has a dedicated IP
address and acts like a network device.




Fig. 1. Structure of the LSS (panels: Distributed Components, Server-Side).
TRIP Clients are client programs that consume sensor and computing resources.
A TRIP Client acquires images from the network cameras and applies
image processing to extract the identification and location of the tagged objects. TRIP 
Client programs are implemented on different computers (computing resources) that 
may be distributed across a facility, and their number can be increased as needed. The 
results obtained from multiple clients are combined in the Application Server, which 
is also responsible for controlling the status of the cameras and TRIP Clients. It in- 
forms the operator (a facility manager, for example) of inactive sources and dynamically 
assigns active cameras to active TRIP Clients by taking their workload feedback into 
consideration. This arrangement minimizes operator overhead. 
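
The workload-aware assignment of cameras to TRIP Clients can be sketched as a greedy
balancing step. This Python fragment is a hypothetical illustration: the function name,
the workload units and the rule that each camera adds one unit of load are assumptions
made for the example and do not describe the Application Server's actual policy.

```python
from typing import Dict, List


def assign_cameras(cameras: List[str], client_load: Dict[str, float]) -> Dict[str, List[str]]:
    """Greedily give each active camera to the currently least-loaded active client.

    client_load maps a client name to its reported workload. Each assigned
    camera is assumed to add one unit of load (a simplification).
    """
    if not client_load:
        raise ValueError("no active TRIP Clients available")
    load = dict(client_load)
    assignment: Dict[str, List[str]] = {name: [] for name in load}
    for camera in cameras:
        # Ties are broken by client name so the result is deterministic.
        target = min(load, key=lambda name: (load[name], name))
        assignment[target].append(camera)
        load[target] += 1.0
    return assignment


plan = assign_cameras(["cam1", "cam2", "cam3"], {"clientA": 0.0, "clientB": 1.5})
```

Inactive cameras or clients would simply be left out of the inputs, which matches the
idea that the operator is only informed about them.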

All data regarding TRIP Clients, cameras, object information and system parameters 
are stored in the XML (Extensible Mark-up Language) format. The Database Server 
provides remote access to XML data for other components of the LSS. The User Interface 
Server is responsible for the communication between an operator and the system. It 
presents combined location sensing results (object identifications and locations) to the 
operator and enables the adjustment of system parameters from web browsers. 



3.2 Processing Steps of the LSS 

The primary goal of the LSS is to collect visual data from the sensors, and extract object 
identification and location information in a wide area. Figure 2 demonstrates the process 
flow that takes place in the system to convert images to location data. Raw camera 
data is acquired and subsequently transferred to the Image Processing unit through the 
Camera Interface. Object IDs and locations are extracted by the Image Processing unit 
and are conveyed subsequently to the Coordinate Translation. Coordinate Translation 
transforms the location data with respect to camera coordinates to the location data with 
respect to room coordinates. Multiple Image Processing and Coordinate Translation
units run in parallel in the LSS, as they are implemented within distributed TRIP Clients.
The ID and location data extracted from the various camera devices are combined in the
Object Fusion phase, utilizing current and previous object information. Final results are
transformed into data packets for convenient data communication and transferred to the
main system, the ECU, for model construction. These data are also transferred to the
Object History database for further processing by Object Fusion.



3.3 Camera Interface 

Network cameras are used as sensor devices for capturing images of the environment.
Each network camera has a dedicated IP address and acts like a network device. Network
cameras are inexpensive, off-the-shelf products. They have embedded web servers that
convey images to the consumers through HTTP. The Camera Interface is executed in the
TRIP Clients for each network camera assigned by the Application Server. This unit acts
like a hardware interface and isolates the software parts of the system from the hardware
units, so that the effect of any hardware change on the overall system is minimized. The
Camera Interface handles communication and image acquisition by connecting to the
network camera as a web client; the required communication parameters are retrieved from
the Camera Database (Figure 2).

3.4 Image Processing 

Image Processing, executed within the TRIP Client component, acquires images from 
the Camera Interface, and applies image processing algorithms for location sensing 
(Figure 2). The system works with circular black-and-white TRIP coded tags that can be 
generated with an ordinary laser printer (Figure 3). Patterns on the circular partitions of 
the tag determine the ID code, position of synchronization sector (starting point), actual 
length of the radius of the tag in millimeters and even-parity bits. 




Fig. 3. Sample visual tag with TRIP code: 10 2011 221210001 (even-parity = 10, radius
length = 58 mm, ID = 18795) [21]. Labelled parts: even-parity sectors, synchronization
sector.
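
As a small worked example of how such a code can be read, the Python sketch below splits
the sample code from Fig. 3 into its fields and interprets the radius field as a base-3
number. The field widths (2 parity digits, 4 radius digits, 9 ID digits) are read off
the caption and are an assumption of this sketch, not a specification of the TRIP
format. Reading the four radius digits (2011) in base 3 reproduces the 58 mm stated in
the caption; the sketch does not attempt to reproduce the ID value or the parity check,
since their exact rules are not spelled out in this section.

```python
def ternary_to_int(digits: str) -> int:
    """Interpret a string of 0/1/2 digits as a base-3 number."""
    if not digits or any(ch not in "012" for ch in digits):
        raise ValueError("expected a non-empty string of ternary digits")
    return int(digits, 3)


def split_trip_code(code: str) -> dict:
    """Split a code into the fields named in the Fig. 3 caption.

    The widths 2 / 4 / 9 are assumed from the caption's example.
    """
    digits = code.replace(" ", "")
    if len(digits) != 15:
        raise ValueError("expected 15 ternary digits")
    return {"parity": digits[:2], "radius": digits[2:6], "id": digits[6:]}


fields = split_trip_code("10 2011 221210001")
radius_mm = ternary_to_int(fields["radius"])  # 2*27 + 0*9 + 1*3 + 1
```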



The Image Processing unit enhances the original TRIP system [20] by integrating
additional algorithms that make it suitable for building environments. The original
system was implemented on images captured by digital cameras that provide uncompressed,
high-quality data. However, working on raw images is not practical in distributed
real-world environments such as buildings. Network cameras, like other digital video
devices, are designed to convey images as fast as possible for consumer comfort and
therefore apply compression to the original images prior to transmission. To overcome the
compression artefacts, two enhancement algorithms are integrated. The enhanced method
compensates for the artefacts while suppressing the impulse noise generated by the
network cameras, adapts the system for distributed platforms, and improves performance
as described in the following.



Original Image Processing Method. The original method comprises “target recogni- 
tion” and “pose extraction” algorithms that turn the input images of tags into location 
and identification data. The “target recognition” algorithm determines the identification 
and geometric properties of the projections of TRIP tags. Since the projection of a circle 
in an image generates an ellipse, TRIP tags’ circular patterns are observed as elliptical, 
and parameters of these ellipses are extracted from the image. The outermost ellipse of 
a detected tag is marked as “reference ellipse” and its parameters are used for the pixel 
sampling procedure of TRIP code deciphering. The intensity values of the pixels at point 
locations around the reference ellipse determine the entire TRIP code. The TRIP code 
is finally validated with the even-parity check bits [20].

“Target recognition” returns the ID number, radius of the tag, the position of the
synchronization sector and the parameters of the reference ellipse for each identified
TRIP tag. The “pose extraction” algorithm takes these values as input in order to
determine the 3D position and orientation of TRIP tags with respect to the camera. The
algorithm implements a transformation that back-projects the elliptical border of a tag
lying on the camera image plane into its actual circular form lying on the centre of the
target plane. The reverse projection makes the camera image plane become parallel to the
target plane and retrieves the 2D orientation of the TRIP tag by giving out the angles
around the camera’s axes X and Y, $\alpha$ and $\beta$ respectively. The position of the
synchronization sector is used to extract the final component of the orientation, the
angle around the Z-axis, $\gamma$. The distance between the camera and the target plane,
$d$, is computed using the radius length of the tag. Thus, in addition to orientation, the
position vector $[P_x, P_y, d]^T$ is also generated, where $P_x$ and $P_y$ are computed
from the central point of the reference ellipse.
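
The role of the known tag radius in recovering the distance can be illustrated with a
simplified pinhole-camera estimate. This Python sketch is not the back-projection used
by the TRIP algorithm: it assumes a weak-perspective approximation in which the
semi-major axis of the reference ellipse is the unforeshortened image of the tag radius,
and the focal length and pixel values are hypothetical.

```python
def distance_to_tag(focal_length_px: float, tag_radius_mm: float,
                    semi_major_axis_px: float) -> float:
    """Approximate camera-to-tag distance in millimetres (pinhole model).

    Similar triangles: semi_major_axis_px / focal_length_px = tag_radius_mm / d.
    """
    if semi_major_axis_px <= 0:
        raise ValueError("the ellipse must have a positive semi-major axis")
    return focal_length_px * tag_radius_mm / semi_major_axis_px


# Hypothetical values: 800 px focal length, 58 mm tag radius, 23.2 px semi-major axis.
d_mm = distance_to_tag(focal_length_px=800.0, tag_radius_mm=58.0, semi_major_axis_px=23.2)
```

With these assumed values the estimate is about 2 m. The sketch also shows why the
radius has to be encoded in the tag itself: without it, the scale of the scene cannot be
recovered from a single image.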



Enhanced Image Processing Method. In our application, the network cameras apply a
wavelet transformation with a compression ratio of approximately 10:1. This process
generates smoothed input images for the TRIP Clients and causes the tag images to lose
sharpness. To compensate for this, TRIP Clients apply an “adaptive sharpening algorithm”
[23] to the input image prior to target recognition (Figure 4). This method first
restores the original image by an unsharp masking process, which is naturally affected by
noise and compression-based ringing artefacts. The algorithm then minimizes these
artefacts by combining the adaptively restored image with the original.
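
The two steps, unsharp masking followed by blending with the original, can be
illustrated on a single image row. The Python sketch below uses a plain 3-tap mean
filter and a fixed blending weight; both are assumptions chosen for brevity, and the
adaptive weighting of the algorithm in [23] is not reproduced here.

```python
from typing import List


def box_blur(signal: List[float]) -> List[float]:
    """3-tap mean filter; edge samples are repeated at the borders."""
    padded = [signal[0]] + list(signal) + [signal[-1]]
    return [(padded[i] + padded[i + 1] + padded[i + 2]) / 3.0 for i in range(len(signal))]


def unsharp_mask(signal: List[float], amount: float = 1.0) -> List[float]:
    """Add back the detail removed by blurring: s + amount * (s - blur(s))."""
    blurred = box_blur(signal)
    return [s + amount * (s - b) for s, b in zip(signal, blurred)]


def blend(original: List[float], sharpened: List[float], weight: float) -> List[float]:
    """Mix sharpened and original samples (weight = share of the sharpened signal)."""
    return [weight * sh + (1.0 - weight) * o for o, sh in zip(original, sharpened)]


edge = [0.0, 0.0, 0.0, 9.0, 9.0, 9.0]  # a step edge in one image row
sharp = unsharp_mask(edge)              # overshoots on both sides of the edge
mixed = blend(edge, sharp, weight=0.5)  # blending halves the overshoot
```

The overshoot next to the edge in `sharp` is the kind of ringing that motivates
combining the restored image with the original instead of using it directly.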

Fig. 4. Enhanced image processing algorithms (labels: NetCam, Image Processing).

In addition to camera artefacts, an increase in the distance of tags to the camera
reduces the pixel resolution of the tag images and makes the TRIP codes harder to
decipher, even though the tags are detected and reference ellipses are extracted
properly. To solve this problem, “edge-adaptive zooming” [24] is applied locally to
spurious TRIP tags from which the TRIP code could not be deciphered or validated.
“Edge-adaptive zooming”, as opposed to its counterparts such as bilinear and cubic
interpolation, enhances the discontinuities and sharp luminance variations in the tag
images. This procedure is repeated until “target recognition” succeeds or the zoomed
image region loses its details (Figure 4). The latter case indicates a false alarm or an
unidentified tag.
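
The repeat-until-recognised loop can be sketched as follows. In this Python fragment the
recognition, zooming and detail-check functions are passed in as callables, and the toy
stand-ins used in the example (pixel duplication instead of edge-adaptive zooming, a
size threshold instead of real recognition) are purely hypothetical. The `max_rounds`
limit is an extra safeguard added for the sketch.

```python
from typing import Callable, List, Optional

Image = List[List[float]]


def recognise_with_zoom(
    region: Image,
    recognise: Callable[[Image], Optional[str]],
    zoom: Callable[[Image], Image],
    has_detail: Callable[[Image], bool],
    max_rounds: int = 4,
) -> Optional[str]:
    """Zoom the region until recognition succeeds or its detail is lost."""
    current = region
    for _ in range(max_rounds):
        code = recognise(current)
        if code is not None:
            return code
        if not has_detail(current):
            return None  # false alarm or unidentified tag
        current = zoom(current)
    return None


def double_size(img: Image) -> Image:
    """Toy zoom: duplicate every pixel horizontally and vertically."""
    return [[v for v in row for _ in (0, 1)] for row in img for _ in (0, 1)]


# Toy recognition succeeds once the region is at least 8 pixels high.
found = recognise_with_zoom(
    [[0.0, 1.0], [1.0, 0.0]],
    recognise=lambda img: "tag" if len(img) >= 8 else None,
    zoom=double_size,
    has_detail=lambda img: len(img) < 64,
)
```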

3.5 Coordinate Translation and Object Fusion 

Coordinate Translation is executed within TRIP Clients after Image Processing (Figure
2). The outcomes of Image Processing are a position vector and orientation angles with
respect to the coordinates of the camera from which the processed image was acquired.
Coordinate Translation converts these to location data with respect to room coordinates.

The location of each camera with respect to the room coordinate system is stored in
the Camera Database. This location data involves the coordinate-rotation vector,
$[CR_\alpha, CR_\beta, CR_\gamma]^T$, that overlaps the axes of the coordinate systems
and thus, when combined with the original rotations $(\alpha, \beta, \gamma)$, gives out
the orientation values with respect to the room. The location data also involves the
coordinate-translation vector, $[CT_x, CT_y, CT_z]^T$, that aligns the origin of the
camera coordinate system to the origin of the room coordinate system, and eventually
gives out the position vector with respect to the room when combined with the original
position vector, $[P_x, P_y, d]^T$.
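
For the position part, this translation amounts to a rigid transformation. The Python
sketch below rotates a camera-frame position by the stored rotation angles and then adds
the translation vector. The rotation order (about X, then Y, then Z) and the sign
conventions are assumptions of this sketch, since the text does not fix them, and the
handling of the orientation angles is left out.

```python
import math
from typing import List, Sequence

Matrix = List[List[float]]


def rotation_matrix(alpha: float, beta: float, gamma: float) -> Matrix:
    """Rotation about X (alpha), then Y (beta), then Z (gamma); angles in radians."""
    ca, sa = math.cos(alpha), math.sin(alpha)
    cb, sb = math.cos(beta), math.sin(beta)
    cg, sg = math.cos(gamma), math.sin(gamma)
    # R = Rz(gamma) * Ry(beta) * Rx(alpha)
    return [
        [cg * cb, cg * sb * sa - sg * ca, cg * sb * ca + sg * sa],
        [sg * cb, sg * sb * sa + cg * ca, sg * sb * ca - cg * sa],
        [-sb, cb * sa, cb * ca],
    ]


def camera_to_room(p_cam: Sequence[float], cr: Sequence[float],
                   ct: Sequence[float]) -> List[float]:
    """Map [P_x, P_y, d] from camera coordinates to room coordinates."""
    r = rotation_matrix(cr[0], cr[1], cr[2])
    return [sum(r[i][j] * p_cam[j] for j in range(3)) + ct[i] for i in range(3)]


# Hypothetical camera: rotated 90 degrees about Z and mounted at (1, 2, 3) m.
p_room = camera_to_room([1.0, 0.0, 2.0], (0.0, 0.0, math.pi / 2), (1.0, 2.0, 3.0))
```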

Object Fusion, implemented in the Application Server, combines the object
identification and location data acquired from parallel-running TRIP Clients (Figure 2).
The same object may be detected by multiple cameras, each of which is assigned to a
different TRIP Client. This may generate repeated records in the system. Object Fusion
combines these duplicate records based on identification codes and room coordinate
locations. Furthermore, TRIP Clients attach time stamps to each extracted object’s
location data. In cases of inconsistency, Object Fusion uses previous object data to
determine the correct identification or location information, and generates the final,
unique object information.
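
One possible fusion rule is sketched below in Python: records are grouped by tag ID,
stale records are discarded on the basis of their time stamps, and the remaining
positions are averaged. The record layout, the one-second window and the averaging rule
are assumptions made for illustration and are not taken from the implemented Object
Fusion component.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Sighting:
    tag_id: int
    position: Tuple[float, float, float]  # room coordinates in metres
    timestamp: float                      # seconds, stamped by the TRIP Client


def fuse(sightings: Iterable[Sighting], window_s: float = 1.0) -> Dict[int, Sighting]:
    """Keep one record per tag, averaging positions close to the newest time stamp."""
    by_id: Dict[int, List[Sighting]] = {}
    for s in sightings:
        by_id.setdefault(s.tag_id, []).append(s)
    fused: Dict[int, Sighting] = {}
    for tag_id, group in by_id.items():
        newest = max(s.timestamp for s in group)
        recent = [s for s in group if newest - s.timestamp <= window_s]
        mean = (
            sum(s.position[0] for s in recent) / len(recent),
            sum(s.position[1] for s in recent) / len(recent),
            sum(s.position[2] for s in recent) / len(recent),
        )
        fused[tag_id] = Sighting(tag_id, mean, newest)
    return fused


records = [
    Sighting(18795, (2.0, 1.0, 0.5), 10.0),  # seen by one camera
    Sighting(18795, (2.2, 1.0, 0.5), 10.2),  # same tag seen by a second camera
    Sighting(18795, (9.0, 9.0, 9.0), 3.0),   # stale record, ignored
    Sighting(42, (0.0, 4.0, 1.0), 10.1),
]
result = fuse(records)
```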

4 A Demonstrative Test 

To evaluate the performance of the LSS, a demonstrative test was performed to observe
the identification and location accuracy. The test configuration was designed to address
system limitations. One limitation is the “distance” of the tags from the camera. An
increase in distance reduces the resolution of the tags, which makes the pixel sampling
unable to locate the circular regions within the tag image. A second limitation is the
“incidence angle” between the normals of the target plane and the image plane. As the
incidence angle increases, the elliptical properties of the tag image projections become
more difficult to acquire.

In our test, 16x16 cm tags were placed at 3 different distances (2, 3, 4 m) and, for
each distance, 3 different incidence angles were evaluated (0°, 30°, 60°). Because camera
artefacts affect input images with varying magnitude and spatial distribution, 30
sequential readings were recorded for each location using the TRIP Client program and a
network camera with a 1/3" CCD sensor and 720x486 resolution.

The test was performed with the “original” and“enhanced” image processing meth- 
ods as described above. Identification percentages are given from the sequential reading 
results in Table 2. 

Table 2. Percentage of identified tags as a function of distance and angle for the “original” and 
“enhanced” methods. 

In addition to identification, location results were also observed. For the enhanced 
method, the position and orientation data are within a maximum error range of ±10 cm 
and ±10 degrees for 3 m distance. The error rates show an increase to ±20 cm and ±35 
degrees for located objects within 3 and 4 m respectively. 

5 Conclusion and Future Directions 

We have presented a location sensing system to support self-updating building models 
for building automation applications. The implemented system has some drawbacks 
inherited from the general disadvantages of the visual methods as it requires line-of- 
sight between the camera and tags, and its performance is dependent on the cameras’ 
image quality. However, the results obtained from our location sensing system suggest 
that vision based location sensing, when enhanced with software methods and integrated 
with appropriate hardware, is a promising technology suitable for spatial domains such 
as facilities and buildings. 

The implemented sensing system is still open for improvements. We expect that in 
the future the tag size can be reduced and the effective distance of the system can be 
augmented. Subsequently, utilizing reference tags will facilitate the automatic calcula- 
tion of coordinate translation data, allowing the relocation of cameras without manual 
system reconfiguration. Moreover, future implementations will enable multiple clients 
to use one camera, thus increasing system speed and efficiency. 


Abstract. An important component of ubiquitous computing is the ability to 
quickly sense the dynamic environment to learn context awareness in real-time. 
To pervasively capture detailed information of movements, we present a decen- 
tralized algorithm for feature extraction within a wireless sensor network. By 
approaching this problem in a distributed manner, we are able to work within the 
real constraint of wireless battery power and its effects on processing and network 
communications. We describe a hardware platform developed for low-power ubiq- 
uitous wireless sensing and a distributed feature extraction methodology which is 
capable of providing more information to the user of events while reducing power 
consumption. We demonstrate how the collaboration between sensor nodes can 
provide a means of organizing large networks into information-based clusters. 



1 Introduction 

To provide context awareness to aid effective ubiquitous computing, wireless sensor 
networks (WSNs) are typically deployed as a data collection tool. Data is funneled through 
an ad-hoc network of low-power microprocessors with embedded sensors to a centralized 
base station. Compared to the distributed sensor nodes, this base station has practically 
unlimited power and memory resources, enabling it to perform complex processing on the 
incoming data and to present it to an end user in a meaningful manner [5,6,1]. When dealing 
with small amounts of information at controlled intervals, as in Kyker’s work [5], which 
sampled radiation levels, or Polastre et al.’s work [6], which sampled various environmental 
conditions, a centralized data poll is feasible on the limited available bandwidth. 

Yet, inherently, this is not a scalable architecture. System bandwidth is proportional 
to the number of nodes connected to the base station, as well as dependent on the 
communication power of these nodes. These nodes are typically relatively few because 
the wireless sensor networks are placed in large and inaccessible areas. The location 
of the base station is often far from the center of activity. The full processing power 
of the base station cannot be utilized because not enough data can be received, and the 
full sensing power of the nodes cannot be utilized because of the bottleneck at the base 
station. In a recent field test [1], Arora et al. illustrated how centralized data fusion is 
capable of flooding the network layers while providing little to no information to the end 
user. 

Another limitation of WSNs is their inability to readily incorporate image sensors into 
their systems. Yet as humans, we rely heavily on our sight because it provides rich 
orthogonal information compared to our other senses. There is much to leverage in research 
devoted to the exploitation of the information provided by images, from clustering of 
images to recognizing and tracking objects within images or video. 

Recently, driven by the incorporation of cameras in cellular phones, low-power CMOS 
image sensors have become available which consume approximately 2 mJ to capture an image 
and have a form factor of less than 10 mm x 10 mm x 5 mm. These sensors can be easily 
interfaced with current WSN platforms. Images captured by these sensors are around 
100,000 bytes and require more memory and processing power than some sensor node 
platforms provide. Recent advances in XScale processing (e.g., Stargate [12] and 
PASTA [8]) provide adequate interfaces to be integrated into a wireless sensor node. 

Clusters of computers working collaboratively are capable of providing greater com- 
putation than the fastest single computer. The strength of parallel or distributed com- 
puting has been shown throughout the high performance computing community. Yet, 
many of the current WSN deployments do not attempt to utilize the processing power of 
the distributed nodes. In many ways, the nature of WSNs lends itself easily to parallel 
and distributed computing because the sensed data is naturally distributed across the 
network. 

The network layers of WSNs have been optimized to handle transfers of small 
amounts of data between nodes. While investigations have begun on how to transfer 
large amounts of data for reprogramming of the sensor nodes [10,3], these solutions 
can be inadequate for sensor system tasks due to time constraints. For example, the methods 
of [10,3] take minutes to traverse a few hops. By the time the image is received at the 
base station, the event of interest (e.g., a person) could have traveled far away and be out 
of detection range. 

In WSNs, it is essential to take advantage of the distributed processors because 
of the time and power constraints of these types of systems. There exists a tradeoff 
between the amount of instantaneous information the system can provide and how long 
the system can provide information in general. Assuming infinite bandwidth, constantly 
shipping the maximum amount of information back to a base station will provide greater 
resolution of the environment, but will also result in a shorter lifespan of the sensor 
network. Reducing the amount transmitted will increase the lifespan, but information 
will be missed. Introducing user feedback to this system can provide a varying amount of 
information, but the time constraint of monitoring real-life events most likely will be too 
tight to utilize this information. In situ processing is necessary to improve the tradeoff 
between the resolution of information and the lifetime of the sensor networks [2]. In 
situ processing allows for sensor nodes to immediately comprehend the environment 
through their on board analysis of the sensor data. Through this analysis, they are able 
to perform specific actions to better inform the end user, by gathering more detailed 
sensed data, performing more complicated computation on the sensed data, or directing 
the activities of other nodes. These actions can be performed with little delay, capturing 
information on time-sensitive events that would otherwise be lost. 

We present a distributed approach to achieving context awareness using a set of low- 
power wireless sensor nodes. We demonstrate our algorithm through a sample sequence 
and discuss the power and latency reductions which result from approaching this problem 
in a distributed manner. 







T.H. Ko and N.M. Berry 



2 Related Work 

Only recently have cameras with power characteristics suitable for small wireless devices 
existed. Two examples of embedding vision within small devices are the work of 
Rowe et al. [9] and that of Viola and Jones [11]. Rowe et al. have used an embedded 8-bit 
processor and image sensor to direct the motion of a small robot. Their research has 
proven the capability of rudimentary vision techniques in small embedded devices. We 
have extended this idea of embedded vision to the collaboration of multiple sensor 
nodes to extract information in a distributed environment. Another example system is 
Viola and Jones’s work on face detection on the XScale platform [11]. While their work 
demonstrates the size and complexity of an algorithm we can expect from an XScale 
platform, we recognize that even their work would be too computationally expensive for 
WSNs. To be viable in WSNs, the XScale processor would need to remain in sleep 
mode as much as possible to conserve power. The architectural scheme used in [13] and 
[11] reduces the required computation by quickly ruling out unlikely search spaces, a 
necessary concern when working within the power constraints presented by WSNs. 

Another body of work, led by Yang [14], has focused on crowd monitoring within 
WSNs. While it is not currently implemented on small embedded devices and assumes a 
known location and orientation of the cameras, the underlying research provides insight 
into distributed computer vision. Other work on the use of multiple cameras includes 
Rahimi’s work [7] on simultaneously tracking a single user and resolving node locations 
and orientations, and Khan’s work [4] on tracking moving cars along a highway. While 
these works take a centralized approach to the acquisition and processing of images, 
they illuminate the challenges of comparing images taken from different cameras. 

3 Our Approach 

To gain the necessary situational awareness in the various applications mentioned, we have 
developed a distributed hardware and software architecture for detecting events. We 
maintained the following principles in our design: 

- scales across hardware architecture, 

- adapts processing to environment, 

- and, extends to other sensor suites. 

We have focused on these three principles because of the limitations currently within WSN 
technology [2]. Many of the algorithms currently developed for sensor networks do 
not take into account the constraints of power and therefore require large processing 
nodes to achieve their tasks. It is our hope that algorithms need not depend on 
the availability of large processing nodes, which limits the domains in which this work 
would be applicable. Rather, we would like an algorithm that scales in robustness, accuracy, 
and speed when provided with more computational power. In addition, the proposed 
architecture should take into account that unreliable communication channels can be 
expected and that the frequency and duration of events cannot always be predicted. 

Fig. 1. The ERI platform and additional hardware: a low-power image sensor capable of 
capturing images at around 2 mJ and the XScale-based PASTA board created by USC ISI. 



The last design principle calls for the separation of the information processing and the 
information extraction. 

To this end, we approach the problem by: 

- distributing tasks: Each node does not try to solve the entire problem on its own. 
Rather, neighboring nodes help each other by providing their neighbors with 
the information they have extracted, so that their neighbors may extract different 
information about the event or extrapolate higher-level data for the base station. 

- forming information-based network organization: Clusters are formed and 
communications are determined by which nodes need to share information rather than 
by proximity in location. 

The following sections describe the hardware and system constraints we are working 
within, and a distributed, decentralized approach to feature extraction. 

3.1 Hardware Platform 

We present our hardware platform to illustrate the realistic constraints within which our 
distributed identification system must work. The ERI platform [5] is a low-power 
modular sensor node which is composed of several different pluggable hardware modules 
(e.g., power, processor, sensor, GPS, and radio). The critical hardware specifications of 
the system are a Cygnal 8051 processor, 2K RAM, and a 900 MHz, 1 Mb/s radio. The 
currently implemented modules are designed to be deployed and survive for approximately 
one month on 2 AA batteries with a low duty cycle of less than 1%. 

Due to its modular design, we are able to readily adapt our hardware to the needs 
of different applications as well as quickly upgrade our hardware as new technologies 
arrive. It enables us to quickly integrate the newest technology in image sensors to the 
hardware as well as extra processing power and memory as needed. The type of hardware 
additions needed for embedded image processing are shown in Figure 1. The advent of 
CMOS image sensors has made adding images to WSNs possible, allowing images to 
be captured with as little as 2 mJ. Various companies (e.g., Fujitsu, Agilent, Omnivision, 
etc.) are continuously providing more power-efficient image sensors with smaller form 
factors. Because the imagers capture data at orders of magnitude larger than the 
more traditional sensors such as temperature and acoustic, we require a larger processing 
unit such as the PASTA board created by USC ISI [8] shown in Figure 1. The PASTA 
board is built around an XScale processor which runs at 400 MHz and is capable of 
running Linux. 

Fig. 2. The response of a histogram-based motion detector on a series of frames where a person 
enters into the scene, pauses for a moment, leaves, and enters again. Sample frames are shown 
below the histogram. 



3.2 Software 

The limited battery power of these units tightly constrains the design of the identifi- 
cation algorithm we develop. We use a cascaded approach to minimize the amount of 
computation both to conserve power and increase the speed of the overall system. This 
method begins by detecting motion in images and segmenting the event from the back- 
ground based on this motion. Features are used to correlate independently sensed events 
across the nodes to determine which sensed data corresponds to the same global event. 
More features are then extracted from the event across different sensor nodes to reduce 
computation. 



Motion Detection. To minimize computation and reduce battery consumption, at each 
sensor node we use motion cues as a first pass filter before entering into the more 
computationally intensive part of our system. We have implemented a more cost-effective, 
approximate motion detector which eliminates the need to store a comparison image. A 
histogram is created which places the grayscale pixels into a small number of bins. 
Changes in the histogram are interpreted as detected motion. Figure 2 illustrates how 
storing 20 bytes of data rather than 25,344 bytes is sufficient for detecting motion in an 
image. As the person moves within the scene, the resulting histogram changes perceivably 
more than when no one is in the scene. It is possible to compute this histogram as the 
image is streaming in from the CMOS sensor. 
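
A minimal sketch of such a histogram-based detector is given below. It assumes 8-bit grayscale pixels, equal-width bins, and a sum-of-absolute-differences comparison with a hand-picked threshold; the bin count of 20 is borrowed from the feature histograms in Section 4.3, and none of these choices is confirmed by the text as the actual implementation.

```python
def gray_histogram(pixels, bins=20):
    """Count 8-bit grayscale pixels into a small number of equal-width bins."""
    counts = [0] * bins
    for value in pixels:  # one pass, so it can run while pixels stream in
        counts[value * bins // 256] += 1
    return counts


def motion_detected(prev_hist, curr_hist, threshold):
    """Report motion when the two histograms differ by more than threshold."""
    change = sum(abs(a - b) for a, b in zip(prev_hist, curr_hist))
    return change > threshold
```

Only the previous histogram has to be kept between frames, rather than a full comparison image.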

Fig. 3. Image Segmentation process. 



Image Segmentation. If motion is detected in an image, we segment the area of motion 
from the image. Modeling the background is a challenge when dealing with uncontrolled 
lighting changes and calibration, and often results in a computationally complex 
and memory-consuming algorithm. This work chooses a simpler approach at the cost of 
greater noise in the image segmentation. The accuracy sacrificed at one node to conserve 
power has the potential to be recovered through the additional images captured by multiple 
cameras. With two frames, we can segment where the pixels have changed between 
background and foreground. With three frames, we can improve the segmentation by 
distinguishing pixels that change because the foreground object occludes the background 
from pixels that change because the background reappears as the foreground object 
moves on. By distinguishing these two types of changes, we are 
able to extract the pixels which are associated with the event from the middle frame. 
We extract a bounding box around the motion for a very rough estimate of the object 
location in the image. By analyzing three images, we are able to segment the scene 
without capturing and maintaining a model of the background. 

Figure 3 illustrates the segmentation algorithm. We begin by detecting and thresholding 
pixel changes between 2 frames. The motions detected are bounded, and the intersecting 
area between the frames is used as the segmented event. 
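
The three-frame procedure can be sketched as follows. This is an illustrative reading of the description, assuming grayscale frames stored as lists of rows, a fixed difference threshold, and axis-aligned bounding boxes given as (top, left, bottom, right); the function names are hypothetical.

```python
def changed(frame_a, frame_b, threshold):
    """Mask of pixels whose grayscale value changed by more than threshold."""
    return [[abs(a - b) > threshold for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(frame_a, frame_b)]


def bounding_box(mask):
    """Smallest (top, left, bottom, right) box around all True pixels, or None."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for row in mask for c, flag in enumerate(row) if flag]
    if not rows:
        return None
    return (min(rows), min(cols), max(rows), max(cols))


def segment_middle_frame(first, middle, last, threshold=30):
    """Bound the motion in each frame pair and intersect the two boxes."""
    box_a = bounding_box(changed(first, middle, threshold))
    box_b = bounding_box(changed(middle, last, threshold))
    if box_a is None or box_b is None:
        return None
    top, left = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    bottom, right = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    if top > bottom or left > right:
        return None  # the two motion regions do not overlap
    return (top, left, bottom, right)
```

For an object that moves steadily across the three frames, the intersection of the two boxes covers the object's position in the middle frame, and no background model is stored.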



Feature Extraction. Features of this segmented area are used to characterize an event 
as it traverses the environment. In many sensor systems, each sensor node determines its 
computation independently of the others. In this work, we direct the computation 
of a sensor node by sharing the features previously extracted on other sensor nodes. 
There are two aspects of collaboration we explore in this work. The first is distributing 
the extraction of features across the network. As an event traverses the space, each sensor 
node extracts a different feature than those it has received. This results in a reduction of 
computation proportional to the number of features that need to be extracted. The 
second aspect will be discussed in the following section. We work under the paradigm 
that each sensor node has a limited time to compute features. The choice of features 
is determined by messages broadcast by other sensor nodes, containing the previously 
extracted features. 
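
As a rough illustration of this selection rule, the sketch below picks the first feature that has not yet arrived in a neighbor's broadcast. The feature names follow Section 4.3, while the fixed priority order and the dictionary message format are assumptions made for the example.

```python
# Hypothetical priority order; the feature names are those listed in Section 4.3.
FEATURES = ["size", "aspect_ratio", "position", "color_histogram"]


def next_feature(received):
    """Pick the first feature that no neighboring node has reported yet."""
    for name in FEATURES:
        if name not in received:
            return name
    return None  # every feature is already known; nothing left to compute
```

A node that has received `size` and `aspect_ratio` from its neighbors would therefore spend its limited compute time on `position`.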



Data Association. Even though the images are captured asynchronously from unknown 
positions and orientations, we use the covariance across different feature vectors 
extracted from them to determine which features belong to the same event and which 
do not. The more variance that exists, the less likely it is that the two events are the same. 
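
The text does not give the exact statistic, so the following is only a stand-in for the idea: it sums, over all feature components, the variance between two nodes' observations and compares the sum to a threshold. The function names, the use of normalized histograms, and the threshold value are assumptions.

```python
from statistics import pvariance


def dissimilarity(features_a, features_b):
    """Sum over all components of the variance between the two observations."""
    return sum(pvariance(pair) for pair in zip(features_a, features_b))


def same_event(features_a, features_b, threshold):
    """Treat two observations as one event when their summed variance is small."""
    return dissimilarity(features_a, features_b) <= threshold
```

Two nodes that report near-identical normalized histograms fall below the threshold and are associated; clearly different histograms exceed it.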



4 Results 

We validate our decentralized approach by illustrating decentralized data association 
and distributed feature extraction. An analysis of battery power consumption is provided 
to illustrate how this approach can improve the lifetime of a WSN. 



4.1 Motion Detection 

Using a histogram for motion detection provides a simple solution for systems with 
limited computation and memory. When an event occurs, the change in the histogram 
is significantly different from the naturally-induced changes from motions such as the 
wind moving branches and leaves. Figure 4 provides the results from a specific sequence, 
highlighting its ability to detect moving objects within the scene. Small objects which 
have coloring similar to the background tend to be missed by this approach. An example 
is also shown in the figure. 



4.2 Image Segmentation 

Some sample segmentations are shown in Figure 5. For some objects, such as a person 
or a car, the image segmentation works well. Others, like a bike, widen the 
segmentation and result in a large percentage of background in the segmented area, which 
will make correlating events from different viewpoints a challenge. This approach is also 
challenged when more than one object is traversing the same viewpoint. 



4.3 Features 

We have extracted a simple set of features which can be computed quickly at a sensor node. 
These features are size, aspect ratio, position, and a 20-bin histogram per color channel. 
Not all features are computed at all nodes. These features were chosen for their ability 
to capture useful information for humans as well as for their minimal computational 
requirements. 
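
To show how cheap the geometric features are, here is a small sketch that derives them from a segmentation bounding box. The box format (top, left, bottom, right), the use of box area as "size", and the box centre as "position" are assumptions; the text does not define these features precisely.

```python
def box_features(box):
    """Derive size, aspect ratio, and centre position from a bounding box."""
    top, left, bottom, right = box
    height = bottom - top + 1
    width = right - left + 1
    return {
        "size": width * height,            # area of the box in pixels
        "aspect_ratio": width / height,
        "position": ((left + right) / 2, (top + bottom) / 2),  # (x, y) centre
    }
```

Each feature needs only a handful of arithmetic operations, which fits the limited compute time available at a node.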

Fig. 4. Output of motion detector. Two of the segmented motions are shown to illustrate the type 
of motion detected. The last image illustrates how the motion of small objects is missed. 




Fig. 5. Resulting image segmentation. Two of the segmented motions are shown to illustrate the 
type of motion detected. The last image illustrates how the motion of small objects is missed. 



4.4 Data Association 

The ability of sensor nodes to autonomously associate the data coming from their own 
sensors as well as messages shared with them will ease the deployment of sensor nodes. 
This is one of the pieces which enable casual placement of the sensor nodes. In this 
work, as the sensor nodes extract histograms of the segmented event, they share these 
features with each other. The covariances of the features are calculated and summed 
together. We threshold the covariance and determine that events with little variance are 
most likely the same event and those with greater variance are most likely different events. 
By performing in-network data association, we can begin to understand the relationships 
between sensor nodes and organize our network through shared information, further 
reducing the communications. 

Fig. 6. Size of detected events from two different asynchronous cameras. 

Fig. 7. Aspect Ratio of detected events from two different asynchronous cameras. 

Fig. 8. Position of detected events from two different asynchronous cameras. 

Various covariance values are shown in Figure 10. The original sequence captured 
by two cameras is shown to have less variance than two cameras looking at different 
events in general. 

Fig. 9. Three color channel histograms of detected events from two different asynchronous 
cameras. Each row has a red, green, and blue histogram of the images captured from a single camera. 

Fig. 10. Covariance of extracted histogram features across different events and cameras. 



4.5 Analysis 

Feature values taken from two casually aligned camera sensor nodes which have 
overlapping viewpoints are shown in Figures 6, 7, 8, and 9. In a centralized approach, we would 
need to extract all these features at each sensor node and transmit this information back 
to the base station. With a decentralized approach, we could distribute the computation 
across several sensor nodes, and only compute what was necessary for transmitting. 

Placing this discussion in a realistic sensor deployment further illustrates the 
advantages of our approach. Imagine a sensor deployment of 100 nodes, placed in an 
approximately 10x10 grid layout. A typical communication bandwidth of sensor nodes 
is about 1 Kbit/sec. Estimating that there are 10 sensor nodes at the edge of the grid 
which are within communication range of the base station, we can effectively transmit 
10 Kbits/sec. An average segmented event is around 500 pixels. This implies that only 
approximately 2 images can be sent back to the base station each second by the entire 
sensor network, ignoring all the underlying routing communications needed. In a 
reasonable deployment, more than 2 cameras will see the same thing. Clearly, this is not 
a viable solution. Another approach would be to transmit the extracted features. A 
64-byte histogram is around 500 bits of data, resulting in the capability to transmit 
around 20 histograms per second. Given that a few cameras are viewing the same event, this 
would limit the system to approximately five events per second. 
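
The estimates in this paragraph can be reproduced with a few lines of arithmetic. The sketch assumes one byte per pixel and four cameras per event, neither of which is stated explicitly in the text; the other figures are taken from the paragraph.

```python
LINK_RATE_BPS = 1_000   # per-node bandwidth from the text, bits per second
EDGE_NODES = 10         # nodes within communication range of the base station
BITS_PER_PIXEL = 8      # assumption: one byte per pixel
CAMERAS_PER_EVENT = 4   # assumption: "a few cameras" taken as four

total_bps = LINK_RATE_BPS * EDGE_NODES            # aggregate rate into the base station
image_bits = 500 * BITS_PER_PIXEL                 # average segmented event of 500 pixels
histogram_bits = 64 * 8                           # 64-byte histogram

images_per_second = total_bps / image_bits
histograms_per_second = total_bps / histogram_bits
events_per_second = histograms_per_second / CAMERAS_PER_EVENT
```

Under these assumptions the values come out at 2.5 images, about 19.5 histograms, and about 4.9 events per second, in line with the rounded figures quoted above.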

The decentralized approach allows us to reduce the computation by the number 
of nodes the process is distributed across and the number of features to be extracted. 
Assuming we have four features we wish to extract, we reduce the communication 
by at least a fourth. 

5 Conclusion 

We have presented a distributed approach to extracting features that is capable of 
reducing the computation and communication at the sensor nodes, two critical factors in 
increasing the lifespan of the sensor system. This work has been motivated by the need 
for ubiquitous computing which is as easy to deploy and maintain as it is pervasive. The 
collaboration between sensor nodes is exploited to perform data association within the 
network. This facilitates intelligent distributed feature extraction, presents the user with 
higher-level information about the system using less bandwidth and power, and provides 
a discovery mechanism for information-based clusters of sensor nodes. This investigation 
has spurred a more detailed look into optimally selecting features based on the previously 
extracted features to quickly build up a model of the event, as well as into extending 
the emergent tracking of the event from sensor node to sensor node to a finer-grained 
tracking system which infers 3D positions. 


Abstract. To realise an Ambient Intelligence environment, it is 
paramount that applications have access to information about the context 
in which they operate, preferably in a very general manner. For 
this purpose various types of information should be assembled to form 
a representation of the context of the device on which aforementioned 
applications run. To allow interoperability in an Ambient Intelligence 
environment, it is necessary that the context terminology is commonly 
understood by all participating devices. In this paper we propose an 
adaptable and extensible context ontology for creating context-aware 
computing infrastructures, ranging from small embedded devices to high- 
end service platforms. The ontology has been designed to solve several 
key challenges in Ambient Intelligence, such as application adaptation, 
automatic code generation and code mobility, and generation of device 
specific user interfaces. 



1 Introduction 

Small portable devices, such as PDAs and mobile phones, are becoming more 
widespread. As a consequence, people are expecting the functionalities provided 
by these devices to increase. GSMs with a quite extensive amount of organizer 
software, games and multimedia services are no exception; rather, they are rapidly 
becoming a default asset in everyone’s life. As devices grow more powerful with 
respect to computing power and autonomy, we expect the software on such 
embedded devices to become more advanced too. Additionally, embedded systems 
are gaining a foothold at home and at work. Home automation systems, for 
example, are no longer the rare expensive gadgets they used to be. Observing 
these trends, the IST Advisory Group (ISTAG) [1] has concluded that within 
a few years, real Ambient Intelligence (AmI) environments will emerge. In such 
environments, devices will communicate and interact independently, without 
immediate user interaction. The devices will make decisions based on a variety of 
factors, including user preferences and the presence of other users in the near 
neighbourhood. 

To accomplish this, devices need to be aware of contextual information within 
their environment. In order to sort out any information that may characterize 
the situation of a person or a computing device, it is necessary to structure the large 
amount of data so that valuable information can be synthesized from varying 
sources. The resulting structured data is called the context of the device. The 
context thus describes all the relevant information to allow software on a device 
to semi-automatically interact in a well-defined way with its environment. The 
context model proposed in this paper will be used in the CoDAMoS project [2] 
to solve several key challenges in the area of Ambient Intelligence by supporting 
context-driven adaptation of mobile services. 

A short overview of the context requirements to support an Ambient Intel- 
ligence environment is given in section 2. In section 3 we describe related work 
on the modeling of context and their shortcomings. We then present our context 
ontology proposal in section 4 and end this paper with a conclusion and future 
work in section 5. 

2 Requirements for Ambient Intelligence 

The aim of AmI computing infrastructures is to provide intelligent services to 
the user by targeting software towards a specific context before delivery, and 
adapting it to a changing context after delivery. More specifically, it will require 
integration of state-of-the-art concepts within several computer science research 
domains, such as application adaptation, code mobility in nomadic environ- 
ments, automatic code generation and context-aware user interfaces. Therefore, 
detailed context information should be provided to be able to accomplish these 
objectives, resulting in the following requirements for a basic context model: 

R.1 Application adaptivity: With dynamic environments and changing 
contexts in mind, it is important that applications support some degree of 
adaptivity. Hence, up-to-date information about the user, available services and 
host platforms, network connectivity, time, location and other sensed data 
should be included in the context model to assist appropriate application 
adaptation. 

R.2 Resource awareness: As resources on embedded devices are sometimes 
too limited to run certain services, sufficient information about maximum 
and currently available resources, such as processing power, memory, battery 
lifetime and bandwidth, is needed to consider service adaptation or service 
relocation for lowering resource usage. 

R.3 Mobile services: When the location of a user changes over time, whole 
services or parts thereof must be able to migrate almost instantaneously. 
Therefore, detailed information about the execution platform should allow 
autonomous migration when, for example, compatible virtual machines exist 
on two different platforms. 

R.4 Semantic service discovery: Semantic discovery based on context 
information enhances key-value based matching protocols by automatically 
incorporating search criteria that are relevant for the current user or device. 

R.5 Code generation: By specifying the operating system, drivers, software 
libraries and virtual machines on an embedded device, code generation can 
be used to generate a dedicated implementation of a high-level service 
specification to broaden the range of devices on which services can be deployed. 

R.6 Context-aware user interfaces: Services at the end-user side that have 
to work within tight resource boundaries on mobile devices need user 
interfaces that are adapted to their context of use. User interfaces can further 
adapt dynamically if the context changes over time. 

These requirements allow mobile services to be designed in a generic way, with 
functional variations to be generated for a range of platforms, but also so that 
they can adapt to context elements such as other services and resources available 
in their context. 



3 Related Work 

Context-awareness is a hot research domain, with interesting topics such as 
context modeling, formal context languages for specifying facts and interrela- 
tionships, and infrastructure support for querying and reasoning on contextual 
information using an inference engine. 

The Context Ontology Language (CoOL) [3] is an ontology-based context 
modeling approach, which uses the Aspect-Scale-Context (ASC) model where 
each aspect (e.g. spatial distance) can have several scales (e.g. kilometer scale or 
mile scale) to express some context information (e.g. 20). Mapping functions exist 
to convert context information from one scale to another. CoOL is very useful 
for describing concepts with an inherent metric ordering such as in requirement 
R.2, though less practical for expressing scales for aspects as in requirement R.1.
Chen et al. [4] propose a context broker architecture (CoBrA) using an ontology 
to describe persons, places and intentions. Less emphasis is put on the notion 
of services and related aspects, such as user interfaces and mobile devices on 
which these services are deployed, needed to fulfill the above requirements. Gu 
et al. [5] present a service-oriented context-aware middleware (SOCAM) based 
on a context model with person, location, activity and computational entity (such 
as a device, network, application, service, etc.) as basic context concepts. The
notion of mobile services seems to be beyond the scope of this context model.
Henricksen and Indulska [6] propose a context model that describes context based 
on several types of facts (e.g. sensed, static and profiled) subject to constraints 
and quality annotations. 
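
The idea of mapping functions between scales can be made concrete with a small Java sketch. It only illustrates the Aspect-Scale-Context idea as summarized above: the names DistanceScale, DistanceInfo and convertTo are hypothetical and do not come from CoOL [3], and the sketch assumes that every scale of the aspect can be converted through a common base scale (kilometers).

```java
// Hypothetical illustration of the Aspect-Scale-Context idea; the names are not CoOL vocabulary.
enum DistanceScale {
    KILOMETER(1.0),
    MILE(1.609344); // length of one international mile in kilometers

    private final double kilometersPerUnit;

    DistanceScale(double kilometersPerUnit) {
        this.kilometersPerUnit = kilometersPerUnit;
    }

    double toKilometers(double value) {
        return value * kilometersPerUnit;
    }

    double fromKilometers(double kilometers) {
        return kilometers / kilometersPerUnit;
    }
}

// Context information for the aspect "spatial distance": a value on one particular scale.
final class DistanceInfo {
    final double value;
    final DistanceScale scale;

    DistanceInfo(double value, DistanceScale scale) {
        this.value = value;
        this.scale = scale;
    }

    // Mapping function: express the same information on another scale of the aspect.
    DistanceInfo convertTo(DistanceScale target) {
        double kilometers = scale.toKilometers(value);
        return new DistanceInfo(target.fromKilometers(kilometers), target);
    }
}

class ScaleMappingDemo {
    public static void main(String[] args) {
        DistanceInfo distance = new DistanceInfo(20, DistanceScale.KILOMETER);
        DistanceInfo inMiles = distance.convertTo(DistanceScale.MILE);
        System.out.println(distance.value + " km is about " + inMiles.value + " miles");
    }
}
```

Routing every conversion through one base scale keeps the number of mapping functions linear in the number of scales, which is one reason such a model suits metric quantities better than unordered aspects.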

Some general description frameworks for expressing context are the Resource 
Description Framework (RDF) [7] and the Web Ontology Language (OWL) [8]. 
Other languages are built on top of these frameworks, but are more tailored to 
describing context. These include the Composite Capability/Preference Profiles 
(CC/PP) [9] and the User Agent Profiling Specification (UAProf) [10]. All have 
been used to specify context. Korpipaa et al. [11] use RDF to describe sensor and 
derived sensor data on mobile devices. CC/PP was used by Indulska et al. [12], 
but found to be too limited to describe complex context models. OWL, on the 
other hand, allows the definition of more complex context models and is used in 
several approaches [3,4,5]. 

4 Extensible Context Ontology 

Considering the fast evolution in the hardware and software industry, it is impor- 
tant that decisions made today regarding our context specification are adaptable 
and extensible. Thus, we should remain as conservative as possible, keeping open 
the options for change in our context model. We therefore opted to define a ba- 
sic, generic context ontology 1 . Ontologies provide classes of objects, relationships 
and domain constraints on their properties. By mapping concepts in different 
ontologies, structured information can be shared. Hence, ontologies are good 
candidates to express meaning within our context specification. 



4.1 General Overview 

We determined four main entities around which we built our ontology. These 
are based around the most important aspects in context information, which are 
also, sometimes partially, discussed in [13,14,15]: 

User: The user plays an important role within Ambient Intelligence. The appli-
ances within the user’s environment should adapt to the user, and not vice versa.
Important properties include the user’s profile, but also his preferences, mood
and current activity.

Environment: The environment in which the user interacts is an important as- 
pect of the context specification. It consists of time and location information, 
and environmental conditions, such as temperature and lighting. 

Platform: This part is dedicated to the hardware and software description 
of a specific device. This includes among other things specifications of the 
processor, available memory and bandwidth, but also information about the 
operating system and other available software libraries. 

1 The current implementation of our context ontology in OWL can be found at
www.cs.kuleuven.ac.be/cwis/research/distrinet/projects/CoDAMoS/ontology/

Service: Services provide specific functionality to the user. Specifying semantic
and syntactic information sustains easy service discovery and service inter- 
action using a well-defined service interface. 

Every device will contain its own context specification with a full description of 
its provided services, while containing pointers to relevant information on devices 
in its environment. An overview of the proposed context ontology 2 is given in 
figure 1. 

Fig. 1. Context ontology overview



4.2 User 

According to Dey [15], context information is only relevant if it influences a user’s 
task. This is why the user should take a central place in the Ambient Intelligence 
philosophy. Collecting information about his context enables applications and 
services to improve the usability of appliances. By fulfilling requirements
R.1 and R.6, it is possible to adapt the application as well as the user interface
to the user’s preferences. In the ontology a distinction is made between a user’s 
preference, such as a preference for using small fonts, and his profile, containing 
facts such as gender, name and current employer. While the former may be 
subject to the current situation, the latter remains more or less static. 

When a user performs a task, this can be subdivided into several activities.
Clerckx et al. [16] show it is possible to link context information to a model
describing the tasks a user can perform while using an application. The user fulfills
a certain role, e.g. the project manager who is heading off to work for a meeting
or the considerate father who picks up his children from school. Hence, people
have different roles, but also different moods, and their personal preferences may
depend on both issues. For example, consider the project manager, drowning in
work, who does not want to be disturbed unless for urgent matters. Figure 2
shows the relevant user concepts and relationships.

2 A (*) means a relationship with multiplicity of 1 or more. 

Fig. 2. User ontology concepts



4.3 Environment 

A user is not a singular entity in an ambient environment. He interacts through 
various devices with his environment and with other people. This environment 
continuously provides information that allows him to make well-informed deci- 
sions or that can influence his behaviour. However, the diversity of entities that 
can be sensed or measured is enormous, if not infinite. It is therefore useless to 
try to describe everything within the surroundings of a user or a device. As user 
mobility is a key aspect within Ambient Intelligence, important concepts in this 
part of the context specification to meet requirements R.1 and R.3 include: location,
time and some environmental conditions. For example, due to some cloudy
weather and heavy rain outside, the home automation system might decide to 
turn on the lights. Of course, this is not needed in the middle of the night or 
if nobody is at home. Figure 3 gives an overview of the ontology concepts and 
relationships for the environment. 

Another issue is that this information might be sensed by varying sources 
with different accuracies, with possibly conflicting measurements. It is very im- 
portant that we are reasonably confident about the accuracy of the derived 
information within the context specification. Note that the environment is not 
directly related to the user, but rather through the used platform: The envi- 
ronment is always sensed through a device. By explicitly specifying this, it is 
possible to reason about several properties of the sensed environment that re- 
quire knowledge of the measuring device, e.g. accuracy. 

Fig. 3. Environment ontology concepts
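
The link between a sensed value and the device that measured it can be sketched in Java as follows. This is a minimal sketch under stated assumptions, not part of the ontology itself: the names SensingDevice, Reading and EnvironmentModel are hypothetical, accuracy is assumed to be a single maximum error per device, and preferring the reading of the most accurate device is only one possible policy for resolving conflicting measurements.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Hypothetical device description: accuracy is assumed to be a maximum error
// in the unit of the measured quantity (smaller is better).
final class SensingDevice {
    final String name;
    final double accuracy;

    SensingDevice(String name, double accuracy) {
        this.name = name;
        this.accuracy = accuracy;
    }
}

// A reading never stands alone: it keeps a reference to the device it was sensed through.
final class Reading {
    final String condition; // e.g. "temperature"
    final double value;
    final SensingDevice source;

    Reading(String condition, double value, SensingDevice source) {
        this.condition = condition;
        this.value = value;
        this.source = source;
    }
}

final class EnvironmentModel {
    // Resolve conflicting readings of one condition by trusting the most accurate device.
    static Optional<Reading> mostTrusted(List<Reading> readings, String condition) {
        return readings.stream()
                .filter(r -> r.condition.equals(condition))
                .min(Comparator.comparingDouble((Reading r) -> r.source.accuracy));
    }
}
```

Because each Reading carries its source, a reasoner can ask questions about the measuring device (here, its accuracy) without the user being related to the environment directly.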



4.4 Platform 

The platform section of the ontology provides a description of (i) the software
that is available on the device for the user or other services to interact with, and
(ii) the hardware, which determines the resources of the device. Since the presence
of certain hardware and software elements in devices can vary, only the relevant 
entries of the context specification are filled in. An overview of this part of the 
context specification is shown in Figure 4. 

The software installed on the device. The available software on a device is 
specified for the following reasons: (i) a service may require certain functionality 
to run, thus before deployment the service provider should be able to check 
for the presence of said functionality, and (ii) automated service builders must 
know for which software platform they are generating code. Hereby, we fulfill 
requirements R.3 and R.5. 

Software that is available on the device can be described by the following 
required parameters, or properties in the context specification: 

Name: The software component name, e.g. Java Media Framework. 

Edition: The software edition, if applicable, e.g. Enterprise Edition. 

Version: The software version, e.g. 2.11. 

Fig. 4. Platform ontology concepts

While we will in general generate code [17] for a high-level API, such as the
Java API, it may sometimes be necessary to drop to a lower level, such as
the operating system or C library. Therefore, we define the various software
components, ranging from the lowest level to the high-level APIs. By specifying
an exact edition or version of the software components, we can generate code
that is optimised for usage with these components.

Operating system: When code is generated for this level, it is necessary for 
the code generator to know about the API offered by the operating system 
(system calls) and e.g. the C-library (if any) present on the system. Examples 
are Windows CE 3.0 and Linux-2.4.19/glibc-2.3.2.

Virtual machine: If a virtual machine is present, the code generator should 
know what type of machine-independent representation the machine accepts 
and what API is offered by it. Examples are J2EE [18], J2ME [19] and 
.NET [20]. We also need to know the vendor and version of virtual machines. 
We have shown in [21] that the JVM can have a significant influence on the 
execution behaviour of a workload (JVM + application + input), especially 
for short or small applications. 

Middleware: Besides the operating systems and virtual machines that are 
present, additional ‘middleware’ packages and libraries may have been in- 
stalled as well, e.g. a CORBA broker [22]. 

Rendering Engine: This forms the backend for rendering a user interface on 
the particular device supporting at least one modality. Examples are QT, 
Java Swing and Windows Forms. 
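
The check a service provider might run before deployment can be sketched in Java. The sketch below is illustrative only: SoftwareComponent and PlatformDescription are hypothetical names, version strings are assumed to be purely numeric and dot-separated, and treating a newer installed version as acceptable is an assumption that a real deployment policy may not share.

```java
import java.util.List;

// Hypothetical record of one software component, using the three properties
// named in the text: name, edition and version.
final class SoftwareComponent {
    final String name;
    final String edition; // empty string if not applicable
    final String version; // assumed numeric and dot-separated, e.g. "2.4.19"

    SoftwareComponent(String name, String edition, String version) {
        this.name = name;
        this.edition = edition;
        this.version = version;
    }

    // Compare dotted versions numerically; missing parts count as 0.
    static int compareVersions(String a, String b) {
        String[] x = a.split("\\.");
        String[] y = b.split("\\.");
        for (int i = 0; i < Math.max(x.length, y.length); i++) {
            int xi = i < x.length ? Integer.parseInt(x[i]) : 0;
            int yi = i < y.length ? Integer.parseInt(y[i]) : 0;
            if (xi != yi) {
                return Integer.compare(xi, yi);
            }
        }
        return 0;
    }
}

final class PlatformDescription {
    private final List<SoftwareComponent> installed;

    PlatformDescription(List<SoftwareComponent> installed) {
        this.installed = installed;
    }

    // True if the named component and edition is installed in at least the given version.
    boolean provides(String name, String edition, String minVersion) {
        for (SoftwareComponent c : installed) {
            if (c.name.equals(name) && c.edition.equals(edition)
                    && SoftwareComponent.compareVersions(c.version, minVersion) >= 0) {
                return true;
            }
        }
        return false;
    }
}
```

A code generator that needs an exact version, as the text suggests for optimised code, would test for equality (a comparison result of 0) instead of the minimum-version rule used here.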



The hardware of the device. For software deployment purposes, it is impor- 
tant that the context specifies the hardware in the device, such as the CPU type 
and properties, the available memory, networking capabilities, etc. 
If one wants to deploy a piece of software, obviously it should fit on the device,
both statically, and dynamically (at runtime). Furthermore, for e.g. multi-media
applications, it is important that deadlines can be met. Consider for example
the decoding of a video sample. The user wants smooth rendering of the
video frames, making it necessary to decode each frame in time. Thus, if the
performance of the video decoding software is too low, these deadlines will not
be met.

We distinguish five hardware resources that should be described in the context
to accomplish requirements R.1, R.2, R.3 and R.5 for the device to support
service mobility or service profiling: (i) the CPU, (ii) storage (permanent), (iii)
memory (volatile), (iv) power, and (v) network capabilities. Each of these has
several properties that are important for code generation and for subsequent
performance estimation. For the latter, we should know e.g. the cache and TLB 
size, the branch predictor used, etc., as they are used in the performance model 
we are developing. This model is needed to see if the generated code can actually 
run on the device, or if a simpler version should be instantiated. 
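
The three conditions discussed above (static fit, dynamic fit, and meeting deadlines) can be written down as a rough Java sketch. It is deliberately much cruder than the performance model mentioned in the text, which also considers caches, the TLB and the branch predictor: here the CPU is reduced to a single throughput figure, and all class and field names are hypothetical.

```java
// Hypothetical summary of the resources a device currently offers.
final class HardwareProfile {
    final long freeStorageBytes; // permanent storage
    final long freeMemoryBytes;  // volatile memory
    final double cpuMips;        // assumed: millions of instructions per second

    HardwareProfile(long freeStorageBytes, long freeMemoryBytes, double cpuMips) {
        this.freeStorageBytes = freeStorageBytes;
        this.freeMemoryBytes = freeMemoryBytes;
        this.cpuMips = cpuMips;
    }
}

// Hypothetical demand of a frame-based service such as a video decoder.
final class ServiceDemand {
    final long codeSizeBytes;                 // static footprint
    final long peakMemoryBytes;               // dynamic footprint at runtime
    final double millionInstructionsPerFrame; // work needed to decode one frame
    final double framesPerSecond;             // required frame rate

    ServiceDemand(long codeSizeBytes, long peakMemoryBytes,
                  double millionInstructionsPerFrame, double framesPerSecond) {
        this.codeSizeBytes = codeSizeBytes;
        this.peakMemoryBytes = peakMemoryBytes;
        this.millionInstructionsPerFrame = millionInstructionsPerFrame;
        this.framesPerSecond = framesPerSecond;
    }
}

final class DeploymentCheck {
    static boolean fitsStatically(HardwareProfile h, ServiceDemand s) {
        return s.codeSizeBytes <= h.freeStorageBytes;
    }

    static boolean fitsDynamically(HardwareProfile h, ServiceDemand s) {
        return s.peakMemoryBytes <= h.freeMemoryBytes;
    }

    // A frame must be decoded within one frame period, otherwise the deadline is missed.
    static boolean meetsDeadlines(HardwareProfile h, ServiceDemand s) {
        double secondsPerFrame = s.millionInstructionsPerFrame / h.cpuMips;
        return secondsPerFrame <= 1.0 / s.framesPerSecond;
    }

    static boolean canDeploy(HardwareProfile h, ServiceDemand s) {
        return fitsStatically(h, s) && fitsDynamically(h, s) && meetsDeadlines(h, s);
    }
}
```

If canDeploy returns false for the full service, the same check could be repeated for a simpler variant, which mirrors the choice between the generated code and a simpler version described above.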

4.5 Services 

In several computer science domains the concept of services refers to a com- 
putational entity that offers a particular functionality to a possibly networked 
environment. Typical examples of where this term is used are in the domains 
of web services, telematics, residential gateways and mobile services. Although 
the previous domains target different users, they all have in common that these 
services are deployed to offer users a certain functionality using a well-defined 
interface, hereby providing a comfortable way for a user to achieve his goals. 
Our research is focussed on how services can dynamically interact and be aware 
of and be adapted to the current context, while keeping certain QoS aspects in 
mind. A user should be able to discover services in his environment and invoke 
them without too much hassle. This research involves requirements R.1, R.2 and
R.3. These services might be composed of other existing services and be adapted
to personal preferences and to the device on which they are deployed. Hence,
service descriptions should be detailed enough to make this possible.

In figure 5 we give an overview of the main concepts regarding services. Typi- 
cally, a user wants to employ a service to accomplish a specific task. He therefore 
interacts with some I/O device (a touchscreen, keyboard, voice recognition, etc.). 
Services will generally be implemented using software modules being provided on 
a device. Hence, each platform can host several services and/or employ several 
remote services in the neighbourhood when the necessary network infrastructure 
is present. 

The level of detail at which services are described in the context specification 
of a device, depends on where these services are hosted. Each device is responsible 
for having a full description of its own services, including how it can be interfaced 
by other services. A high-level description of the services in its neighbourhood
is adequate for doing service discovery using the context
specification of the device to see if we are interested in a service and would
like to receive more detailed information about it. Information about required
protocols and message formats can be negotiated later on if necessary, hence
keeping bandwidth usage to a minimum by only sending required information.

Fig. 5. Service ontology concepts

We therefore provide a multi-level service description, by extending our con-
text ontology with a service ontology called OWL-S [23]. Although this ontology
is tailored to web services and the semantic web [24], it also provides a rich
and standardized framework to describe services in general. The Semantic Web
community, using the OWL-S ontology specification, addresses the
lack of semantics within WSDL [25] service descriptions by adding a
semantic layer based on the following concepts:

Service profile: It provides a human readable description of the functionality 
of the service by specifying its inputs and outputs, information about the 
service provider, a quality rating and other attributes that can be used for 
service discovery. 

Service model: It describes what happens when the service is carried out, 
by giving more detailed information about the control-flow and data-flow 
involved in using the service so that the user or agent could perform an 
in-depth analysis of whether the service meets its needs. 

Service grounding: The service grounding deals with implementation details
by specifying a communication protocol, message formats and other
service-specific details.

5 Conclusion and Future Work 

The necessity of ontologies for the establishment of context-aware pervasive com- 
puting systems is broadly acknowledged. In this paper, we presented a basic, 
generic ontology for the description of context information. 

The ontology is currently expressed in OWL, but could also be expressed in 
other ontology languages. It consists of four basic context entities: (i) user, the 
central concept in context-aware computing, (ii) environment, the description of 
relevant aspects of the user’s surroundings, (iii) platform, the hardware and software
of the device or devices through which a user interacts with the application
or services and (iv) service, functionality offered in the user’s environment. 

Based on the gained experience and the feedback of industrial partners the 
context ontology will be further refined. Extensions, inevitable for the realization 
of concrete case studies for the CoDAMoS project [2], and refinements will be 
related to the presented basic ontology. 

Further attention will be paid to how emerging standardized ontologies for 
various aspects of context information will relate to the established ontology. 
When needed for our research objectives or accomplishment of case studies, 
relations between the ontologies will be specified to enhance our current ontology. 

Abstract. Because of their severe resource restrictions and limited user
interfaces, smart everyday objects must often rely on remote resources 
to realize their services. This paper shows how smart objects can obtain 
access to such resources by spontaneously exploiting the capabilities of 
nearby mobile user devices. In our concept, handhelds join a distributed 
data structure shared by cooperating smart objects, which makes the 
location where data are stored transparent for applications. Smart ob- 
jects then outsource computations to handhelds and thereby gain access 
to their resources. As a result, this allows smart items to transfer a 
graphical user interface to a nearby handheld, and facilitates the collab- 
orative processing of sensory data because of the more elaborate storage 
and processing capabilities of mobile user devices. We present a concrete 
implementation of our concepts on an embedded sensor node platform, 
the BTnodes, and illustrate the applicability of our approach with two 
example applications. 



1 Introduction 

Smart environments will be populated by different kinds of computing devices 
with varying processing power, energy resources, memory capacity, and different 
means for interacting with users. Handheld devices such as mobile phones or 
PDAs, computer-augmented everyday artifacts, RFID-enabled consumer prod- 
ucts, and wall-sized displays are only some of the devices that are likely to play 
a role in future smart environments. However, as pointed out by Mark Weiser 
[11], “the real power of the concept [of Ubiquitous Computing] comes not from 
any one of these devices; it emerges from the interaction of all of them.” One 
core challenge in smart environments is therefore to exploit their heterogeneity 
by building applications that make use of and combine the specific capabilities 
provided by different types of computing devices. 

This becomes even more important in connection with resource-restricted 
smart everyday objects, which usually possess only very limited user interface 
capabilities. Such a smart object can achieve very little on its own and must 
rely on remote resources to realize its services. Handheld devices are
well suited as resource providers for smart objects because of their complemen-
tary capabilities: while there are potentially many smart objects present in a
smart environment that can collaboratively provide detailed information about
the environment and the context of a user, handheld devices are equipped with
powerful storage media and have more elaborate input and display capabili-
ties.

This paper shows how smart objects can spontaneously exploit the resources 
of nearby mobile user devices. Thereby, handhelds participate in a shared data 
structure established by cooperating objects, and serve as an execution platform 
for code from smart items. Mobile code is developed in Java, embedded into 
C code, and stored on embedded nodes, which themselves cannot execute Java 
programs. We also present an implementation of our approach, consisting of a 
programming framework, a runtime environment for executing code on nearby 
user devices, and a corresponding frontend. 

By smart everyday objects we understand everyday items such as chairs, 
books, or medicine that are augmented with active sensor-based computing plat- 
forms. Hence, smart objects can perceive their environment through sensors, 
collect information about the context of a nearby user, and collaborate with 
other objects in their vicinity by means of wireless communication technolo- 
gies. In this paper, BTnodes [2] serve as a prototyping platform for augmenting 
everyday items. BTnodes are equipped with an autonomous power supply, con- 
nectors for external sensor boards, and Bluetooth modules for communication 
(cf. Fig. 1). 




[Figure: an everyday object with sensors and an active tag (power supply,
communication modules, ...).]

Fig. 1. A smart everyday object: an everyday item augmented with a sensor-based
computing platform.



The rest of this paper is structured as follows: Section 2 summarizes related 
work. Section 3 presents our concepts for integrating handhelds into smart en- 
vironments, while section 4 describes a programming framework for developing 
mobile code for smart objects. Section 5 presents the runtime environment for 
outsourcing computations to handheld devices. Section 6 evaluates our imple- 
mentation, and section 7 presents two example applications. Section 8 concludes 
the paper. 

2 Related Work

Hartwig et al. [3] integrate small Web servers into embedded devices and enable
people to interact with them using mobile phones and a WAP (Wireless Applica- 
tion Protocol) interface. In this paper, we focus on the integration of handhelds 
into sets of cooperating objects, not on the interaction with a single device. We 
also want to enable smart objects to outsource code for energy-consuming com- 
putations to handheld devices in order to spontaneously exploit their resources. 
This does not necessarily require user interaction but can instead be transpar- 
ent for a user. Furthermore, by outsourcing Java code to a mobile device it is 
possible to support more complex user interfaces than with WAP pages. 

Aglets [6] is a programming framework for mobile agents based on Java 
technology. Whereas mobile agents migrate from execution platform to execution 
platform transferring their code, data, and execution state, we only transfer code 
from a smart object to a nearby handheld device. Other data are not transmitted 
but are made available to all cooperating objects by means of a distributed data 
structure. Furthermore, programming frameworks like the Aglets rely on RMI 
for code shipping and a Java Virtual Machine (JVM) that must run on every 
node. In our case, the embedded platforms are so resource-restricted that they
usually do not support a full-fledged JVM. Instead, virtually all embedded sensor 
node platforms are programmed using the C programming language. 

The Stanford Interactive Workspaces project [4] introduces a tuplespace- 
based infrastructure layer for coordinating devices in a room. This tuplespace is 
centralized and runs on a stationary server, whereas we distribute a shared data 
structure among cooperating smart objects and handheld devices, and do not 
assume that there is always a powerful server in wireless transmission range. 

Want et al. [10] augment everyday items with passive RFID tags and thereby
provide them with a representation in the virtual world. Other passive tagging
technologies such as barcodes and two-dimensional visual codes have also been 
used to link real and virtual worlds by attaching them to everyday things [5]. 
Although handhelds with attached scanning devices are used to read out those 
tags, an application associated with an object is usually provided by a service 
in the background infrastructure. In our approach, we do not rely on an always- 
available background infrastructure link. Instead, because of the active tagging 
technology used to augment everyday items, we enable smart objects to offer 
their services independently from a backend infrastructure. 

3 Basic Concepts and Architecture Overview 

In our vision, smart environments are populated by smart objects that provide 
context-aware applications to nearby users. Due to their resource restrictions, 
smart objects thereby need to cooperate with other objects, for example, dur- 
ing the context-recognition process in order to exchange and fuse sensory data. 
To enable cooperation among different devices, smart objects establish a shared 
data space for exchanging sensor values and for accessing remote resources. We 
have implemented such a shared data structure as a distributed tuplespace for
the BTnode embedded device platform 1. Each node contributes a small
subset of its local memory to the distributed tuplespace implementation. Using 
the tuplespace as an infrastructure layer for accessing data, the actual location 
where data is stored becomes transparent for applications. The data space hides 
the location of data, and tuples can be retrieved from all objects that coop- 
erate with each other. Consequently, an application that operates on data in 
the distributed tuplespace can be executed on every device participating in that 
shared data structure. The actual node at which it is executed becomes irrele- 
vant. Hence, when a handheld device joins the distributed tuplespace shared by 
cooperating smart objects, applications developed for a specific smart item can 
be executed also on the handheld device. 
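
How a distributed tuplespace hides the location of data can be illustrated with a small in-memory Java sketch. It is not the BTnode implementation: nodes are plain objects instead of devices connected via Bluetooth, tuples are reduced to a type and an integer value, and all names (SensorTuple, TupleNode, DistributedTupleSpace) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical, simplified tuple: a type tag and one integer sensor value.
final class SensorTuple {
    final String type;
    final int value;

    SensorTuple(String type, int value) {
        this.type = type;
        this.value = value;
    }
}

// Each node contributes a small, bounded part of its local memory to the shared space.
final class TupleNode {
    final String name;
    private final int capacity;
    private final List<SensorTuple> local = new ArrayList<>();

    TupleNode(String name, int capacity) {
        this.name = name;
        this.capacity = capacity;
    }

    boolean store(SensorTuple t) {
        if (local.size() >= capacity) {
            return false; // contributed memory is full
        }
        local.add(t);
        return true;
    }

    // Remove and return the first local tuple of the given type, if any.
    Optional<SensorTuple> take(String type) {
        for (int i = 0; i < local.size(); i++) {
            if (local.get(i).type.equals(type)) {
                return Optional.of(local.remove(i));
            }
        }
        return Optional.empty();
    }
}

// Applications address the space, never an individual node, so data location stays hidden.
final class DistributedTupleSpace {
    private final List<TupleNode> nodes = new ArrayList<>();

    void join(TupleNode node) {
        nodes.add(node);
    }

    // Store the tuple on the first node that still has room.
    boolean write(SensorTuple t) {
        for (TupleNode n : nodes) {
            if (n.store(t)) {
                return true;
            }
        }
        return false;
    }

    // Search all participating nodes for a matching tuple.
    Optional<SensorTuple> take(String type) {
        for (TupleNode n : nodes) {
            Optional<SensorTuple> t = n.take(type);
            if (t.isPresent()) {
                return t;
            }
        }
        return Optional.empty();
    }
}
```

In this sketch a handheld joins simply by calling join with a node of larger capacity; code that calls write and take does not change, which is the property that lets the same application run on any participating device.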

We have realized our concepts for integrating handhelds into environments 
of cooperating smart objects in a software framework called Smoblets. The term 
Smoblet is composed of the words smart object and Applet, reflecting that in
our approach active Java code is downloaded from smart objects in a similar 
way in which an Applet is downloaded from a remote Web server. Fig. 2 depicts 
the main components of the Smoblet system: (1) a set of Java classes - the 
actual Smoblets - stored in the program memory of smart objects, (2) a Smoblet 
frontend that enables users to initiate interactions with nearby items, (3) a 
Smoblet runtime environment for executing Smoblets on a mobile user device, 
and (4) a distributed tuplespace implementation for smart objects and handheld 
devices. 




Fig. 2. Overview of the Smoblet system. 



1 Please refer to http: www.inf.ethz.ch/~siegemun/software/Cluster 1 Tuplespace. pdf for
a more detailed description of our implementation.

Smoblets. The code that is transferred to a nearby user device - in the
following referred to as Smoblet - consists of Java classes that are developed on
an ordinary PC during the design of a smart object. However, as the smart ob-
jects themselves cannot execute Java code, Smoblets encapsulate computations
that are designed to run on a more powerful device but still operate on the data
basis of the smart object that provided the code. The sole reason for storing
Smoblets in program memory is the memory architecture of many embedded
sensor node platforms, which often have significantly more program memory
than data memory. The BTnodes, for example, offer 128kB of program and
64kB of data memory. Only about half of the program memory on the BTnodes
is occupied by typical programs, which leaves ample space for storing additional
Java code.

Front- and backend. The Smoblet backend is responsible for executing 
downloaded code on a handheld device. It also protects the user device from 
malicious programs and enables downloaded Java classes to access data on other 
platforms by providing an interface to the distributed tuplespace implementa- 
tion. In contrast, the Smoblet frontend helps users to search for smart objects 
in the vicinity, to explicitly initiate downloads, and to customize the behavior of the
Smoblet backend system. 

Distributed tuplespace. The distributed tuplespace is the core component 
for integrating handhelds into collections of cooperating objects. Its main pur- 
pose is to hide the actual location of data from an application, which makes it 
possible to execute code on any of the cooperating nodes. Hence, it is possible
to design applications that are executed on a handheld device but still operate 
on the data basis of the smart object the code originates from. The distributed 
tuplespace also allows cooperating objects to share their resources and sensors. 
The memory of other objects, for example, can be used to store local data, and 
it is possible to access remote sensor values. In our application model, smart 
objects specify how to read out sensors and write the corresponding sensor sam- 
ples as tuples into the distributed data structure, thereby sharing them with 
other objects. Please refer to [8] for a more detailed description and an eval- 
uation of our tuplespace implementation for the BTnodes. In order to build a 
running Smoblet system, we have ported our implementation to Windows CE. 
This allows handheld devices to participate in the shared data structure. 



4 The Smoblet Programming Framework 

Besides the components for exchanging and executing Smoblets, the Smoblet 
programming framework supports application developers in realizing the cor- 
responding code for smart objects. Deploying Smoblets involves four steps (cf. 
Fig. 3): (1) using a set of helper classes, Java code containing the computations 
to be outsourced is implemented on an ordinary PC, (2) the resulting class files 
are embedded into C code and (3) linked with the code for the basic application 
running on the embedded sensor node, and (4) the resulting file is uploaded to 
the smart object’s program memory. 
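
Step (2), embedding class files into C code, can be sketched as a small Java tool that prints a C array definition for the bytes of a compiled class. This is a guess at what such a converter might look like, not the tool from the Smoblet framework: the class name ClassToC, the symbol naming scheme and the output layout are assumptions, and the platform-specific way of placing the array in program memory is left out.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical converter: turns the bytes of a compiled class file into C source text.
final class ClassToC {

    // Build a C definition of the form "const unsigned char <symbol>[] = { ... };"
    // followed by a length constant "<symbol>_len".
    static String toCArray(String symbol, byte[] classBytes) {
        StringBuilder sb = new StringBuilder();
        sb.append("const unsigned char ").append(symbol).append("[] = {");
        for (int i = 0; i < classBytes.length; i++) {
            sb.append(i % 12 == 0 ? "\n    " : " "); // twelve bytes per line
            sb.append(String.format("0x%02x", classBytes[i] & 0xff));
            if (i < classBytes.length - 1) {
                sb.append(',');
            }
        }
        sb.append("\n};\n");
        sb.append("const unsigned int ").append(symbol).append("_len = ")
          .append(classBytes.length).append(";\n");
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java ClassToC <file.class> <c_symbol>");
            return;
        }
        byte[] bytes = Files.readAllBytes(Paths.get(args[0]));
        System.out.print(toCArray(args[1], bytes));
    }
}
```

The generated text can then be compiled and linked with the basic application in step (3); since the node never interprets the bytes, it only has to hand them out unchanged when a handheld requests the Smoblet.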

Fig. 3. The process of deploying a Smoblet on a sensor node.

In the following, we concentrate on the development of the actual code that
is outsourced to a nearby mobile user device. This is done on an ordinary PC
using Java and a set of supporting classes provided by our framework. Among
other things, the framework provides classes for retrieving information about 
the current execution environment of a Smoblet, for controlling its execution 
state, and for exchanging data with nearby smart objects. We distinguish be- 
tween two application domains for Smoblets: applications in which smart ob- 
jects transfer entire user interfaces to nearby handhelds in order to facilitate 
user interaction, and applications for outsourcing computations in order to ex- 
ploit a handheld’s computational abilities, without requiring user interaction. 
These two application domains are represented by the GraphicalSmoblet and 
BasicSmoblet classes, which directly inherit from the superclass Smoblet. All 
user-defined code that is to be outsourced to a nearby handheld must inherit 
from these classes. This can be seen in Fig. 4, which shows selected parts of a 
Smoblet for collecting microphone samples from nearby smart objects. 

     import smoblet.*;

     public class MicCollector extends BasicSmoblet {

 5:    public String getSmobletName() {
         return "MicCollector";
       }

       public boolean isAutoStart() {
         return true;
11:    }

13:    public boolean onInit() { ... }

       public boolean onRun() {
         while (extractFeatures) {
17:        res = consumingScanTupleDts(micFeature, 20);
           Thread.sleep(2000);
         }
         return true;
21:    }

23:    public void onHomeDeviceLost() { ... }
     }

Fig. 4. Selected methods from a Smoblet without graphical user interface.

There are four basic categories of methods provided by the Smoblet class
and its subclasses: (1) informational methods, (2) methods for controlling the
execution state of a Smoblet, (3) event reporting functions, and (4) functions
for accessing data in the distributed tuplespace. Informational methods provide
information about the environment of a Smoblet and about the Smoblet itself,
for example its name, a human-readable description of its functionality,
information about its requirements regarding a host platform, and whether it
should be automatically started after download to a handheld device (cf. Fig. 4,
lines 5-11). These informational methods are mainly used by the tool that embeds
Smoblets into C code for storage on an embedded platform. This tool for
converting Java files loads Smoblet classes, retrieves information about them,
and stores this information in addition to the encoded class files on a sensor
node. Consequently, when a user wants to look up Smoblets on a particular smart
object, it is not necessary to download the complete Smoblet to get this
information. Instead, the object shares these data by means of the distributed
data structure with nearby handhelds.

The methods controlling a Smoblet’s execution state are executed by the
runtime environment on a mobile user device after the code has been successfully
downloaded from a smart object. The onInit method, for example, is executed
immediately after a Smoblet has been started, followed by the onRun, onAbort
or onExit methods (cf. lines 13-21 in Fig. 4). Every Smoblet is executed in a
separate Java thread.

Event reporting functions are executed after certain events on the handheld 
or in its environment have occurred. For example, when the device the code 
stems from is leaving the communication range of the handheld or when new 
devices come into wireless transmission range, an event is triggered and the cor- 
responding method executed (cf. line 23 in Fig. 4). Event reporting functions 
are also used to handle callbacks registered on the shared data space. A hand- 
held participating in the space can specify tuple templates and register them 
with the distributed tuplespace. When data matching the given template is then 
written into the space, an event reporting function having the matching tuple 
as argument is invoked on the handheld. 
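
The callback mechanism described above can be pictured with a toy, single-process tuplespace. This is only a sketch of the general idea of template matching with registered callbacks; it is written in Python for brevity, whereas the framework itself exposes this functionality through Java methods whose signatures are not given here. The names `LocalTupleSpace`, `register`, `write` and the wildcard `ANY` are hypothetical.

```python
from typing import Any, Callable, List, Tuple

ANY = object()  # wildcard field in a template (hypothetical convention)

Template = Tuple[Any, ...]
Callback = Callable[[Tuple[Any, ...]], None]


def matches(template: Template, tup: Tuple[Any, ...]) -> bool:
    """A tuple matches if it has the same arity and every
    non-wildcard template field equals the tuple's field."""
    return len(template) == len(tup) and all(
        t is ANY or t == v for t, v in zip(template, tup)
    )


class LocalTupleSpace:
    """Toy in-memory stand-in for the distributed tuplespace."""

    def __init__(self) -> None:
        self._tuples: List[Tuple[Any, ...]] = []
        self._callbacks: List[Tuple[Template, Callback]] = []

    def register(self, template: Template, callback: Callback) -> None:
        """Remember a template and the function to call on a match."""
        self._callbacks.append((template, callback))

    def write(self, tup: Tuple[Any, ...]) -> None:
        """Store a tuple and notify every callback whose template matches."""
        self._tuples.append(tup)
        for template, callback in self._callbacks:
            if matches(template, tup):
                callback(tup)
```

A handheld would register a template such as `("mic", ANY, ANY)` and receive each matching tuple as the argument of its callback; tuples of other kinds are stored but do not trigger it.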

The last category of methods provided by the Smoblet class are methods for 
accessing the shared data structure. These come in three variants: functions for 
accessing the local data store, i.e., the local tuplespace; functions for accessing the 
tuplespace on a single remote device; and functions operating on the tuplespaces 
of a set of cooperating smart objects. The MicCollector Smoblet, for example, 
retrieves microphone tuples from all nodes participating in the shared data space 
(cf. line 17 in Fig. 4). Thereby, it relieves resource-restricted sensor nodes from 
storing microphone samples by putting them in its own memory. 



5 The Smoblet Front- and Backend 

The Smoblet frontend (cf. Fig. 5) is a graphical user interface for searching
Smoblets provided by nearby smart objects, for adapting security parameters,
for retrieving information about downloaded code and its execution state, and
for manually downloading as well as starting Smoblets. If the user permits it,
a handheld device can also serve as an execution platform for nearby smart
objects without requiring manual interaction. In this case, the mobile user device
continually searches for Smoblets on nearby devices. If a Smoblet wants to be
automatically executed, its code is then downloaded and started.

Fig. 5. The Smoblet Manager: the tool for searching Smoblets and restricting access
to resources.

Especially for Smoblets that offer a graphical user interface and therefore 
provide interactive services, the time needed to discover nearby devices can sig- 
nificantly reduce usability. This is especially true for Bluetooth because of its 
relatively poor device discovery performance (a Bluetooth inquiry often takes 
more than 10 s) [9]. As a possible solution, we propose an explicit selection 
mechanism based on passive RFID tags to determine the device address of a 
smart item. Thereby, a passive tag is attached to a smart object containing the 
device address of the BTnode integrated into that object. Also, a small-range 
RFID reader is attached to the mobile user device. A user can then explicitly 
select an object by holding the RFID reader close to it, thereby retrieving the 
corresponding BTnode’s device address. Having this information, the mobile 
code can be immediately downloaded to the handheld device and the graphical 
user interface be started. To experiment with such explicit selection mechanisms 
for triggering interactions with smart objects, we connected a small-size (8 cm x
8 cm) RFID reader over serial line to a PDA.

In contrast to the frontend, the Smoblet backend is responsible for provid- 
ing the actual runtime environment and access to the data structure shared by 
cooperating objects. It also handles the actual download of code, protects the 
handheld device from malicious code, regularly searches for new devices in range, 
and forwards events to Smoblets while they are executed. 

6 Evaluation 

In this section we evaluate our prototype implementation together with the un- 
derlying concepts. In particular, we discuss time constraints for downloading 
code, demands on the underlying communication technology, and the perfor- 
mance overhead caused by cooperating with multiple smart objects. 

[Fig. 6 plots, panels (a) and (b). Legend: 2 nodes, all tuples on remote node; 5 nodes, tuples on remote nodes; 5 nodes, tuples on single remote node. Axis label: Number of result tuples.]



Fig. 6. (a) Time needed for downloading Smoblets from smart everyday objects; (b) 
time needed for a scan operation on the shared data structure. 



Fig. 6(a) shows the time needed for downloading code in the realized prototype.
It can be seen that the throughput achieved is about 37.4 kbit per second,
which is relatively poor compared to the theoretical data rate of Bluetooth but
compares well to data rates measured by other researchers in connection with
Bluetooth-enabled embedded device platforms [7]. However, the limiting factor
when downloading code from a smart object is not Bluetooth itself but thread
synchronization on the mobile user device. As code is usually downloaded
in the background, several threads are executed simultaneously during the
download: Java threads, threads belonging to the Bluetooth stack, and threads
that are used to continuously query for devices in range and to access the shared
data structure. That Bluetooth is not the dominating factor can also be seen in
Fig. 6(a), because the download time does not depend on the number of nodes that
share a Bluetooth channel. Because of the TDD (time division duplex) scheme in
which Bluetooth schedules transmissions and the fact that the Bluetooth modules
used apply a simple round-robin mechanism for polling nodes, an increased
number of devices sharing a channel would imply decreased performance if
Bluetooth were the bottleneck, which cannot be observed in our case. Despite the
relatively low throughput, even Smoblets that offer graphical user interfaces are
typically downloaded in a few seconds. This is because class files and pictures
are compressed before they are stored on a smart object. Typical compression
rates for code are around 40-50 % but less for pictures because they are usually
already in a compact format. For example, the interactive Smoblet presented in
Section 7 has a code size of around 26.9 kB and a size of 15.2 kB after
compression. Therefore it takes only about 3.5 s to download the code from a
smart object to a handheld.
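
The figures quoted in this paragraph can be cross-checked with a few lines of arithmetic. The sketch below is only a back-of-the-envelope estimate: it assumes 1 kB = 1000 bytes and that the measured throughput of 37.4 kbit/s holds for the whole transfer, and its function names are illustrative rather than part of the Smoblet framework.

```python
# Back-of-the-envelope check of the download figures quoted in the text.
# Assumptions (not from the paper): 1 kB = 1000 bytes, constant throughput,
# no connection setup or protocol overhead.

THROUGHPUT_KBIT_S = 37.4  # measured throughput reported for Fig. 6(a)


def compression_saving(original_kb: float, compressed_kb: float) -> float:
    """Fraction of the original size removed by compression."""
    return 1.0 - compressed_kb / original_kb


def download_time_s(size_kb: float,
                    throughput_kbit_s: float = THROUGHPUT_KBIT_S) -> float:
    """Transfer time in seconds for a payload of size_kb kilobytes."""
    return size_kb * 8.0 / throughput_kbit_s


if __name__ == "__main__":
    print(f"saving:       {compression_saving(26.9, 15.2):.1%}")
    print(f"compressed:   {download_time_s(15.2):.2f} s")
    print(f"uncompressed: {download_time_s(26.9):.2f} s")
```

Under these assumptions the saving falls inside the stated 40-50 % range, and the compressed Smoblet needs a little over 3 s, which is of the same order as the roughly 3.5 s reported; the estimate ignores connection setup and protocol overhead.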

As previously discussed, a core concept for integrating handhelds into sets of
cooperating smart objects is to make the location where data are stored
transparent for mobile code. Hence, as long as Smoblets search and exchange data
by means of the distributed tuplespace, they can be executed on every node
participating in that data structure without change. For example, in Fig. 4 line
17, the MicCollector Smoblet scans for all microphone samples in the distributed
data structure and thereby relieves all collaborating nodes from storing that
data in their local tuplespaces. The same program could theoretically be executed
on every device participating in the data structure, always having the same
effect. Fig. 6(b) shows the time needed for a scan operation on the distributed
tuplespace with respect to the number of tuples returned and the number of
cooperating smart objects. Here, all smart objects are members of a single
piconet; the scan operation returns all tuples in the distributed tuplespace
matching a given template. As can be seen, the performance for retrieving data
depends on the distribution of tuples on the remote devices. This is also a
consequence of Bluetooth’s TDD scheme for scheduling transmissions.

We would like to conclude this section with a discussion about the overhead 
caused by cooperating with collections of remote smart objects. In our approach 
only code and no data are shipped from a smart object to a mobile device during 
migration. This is because Smoblets usually operate not only on data from the 
smart object that provided the code, but on data from multiple cooperating 
devices. As can be seen in Fig. 6(b), tuplespace operations on multiple remote
nodes are thereby almost as efficient as operations on a single device but offer 
the advantage of operating simultaneously on many objects. 



7 Applications 

There are two major application domains for Smoblets: (1) exploiting the com- 
putational resources of nearby handhelds in order to facilitate collaborative con- 
text recognition and (2) enabling graphical user interaction with smart objects 
by outsourcing user interfaces to nearby handhelds. In this section, we present 
one example from each of these two application areas. 

Handling large amounts of sensory data. In order to provide context- 
aware services, smart objects must be able to determine their own situational 
context and that of nearby people. This usually requires cooperation with other 
objects and the ability to process local sensor readings together with sensory 
data provided by remote nodes. A significant problem that often arises in these
settings is streaming data, e.g., from microphones and accelerometers, which
is difficult to exchange between and difficult to store on smart objects because
of their resource restrictions. By using the proposed concepts, nearby handheld 
devices can help in handling large amounts of sensory data during the collabo- 
rative context recognition process. 

As an example, we have implemented a Smoblet for evaluating which smart
objects are in the same room and for finding out what is happening at a specific
location. This is done by means of low-cost microphones attached to smart
objects (for our experiments we have used the sensor boards described in [1]).
When a user carrying a handheld device comes into the range of a smart object,
the object automatically transmits a MicCollector Smoblet (cf. Fig. 4) to the
mobile user device, where it is automatically started. Smart objects continuously
sample their microphones at approximately 40 kHz and extract a feature from these
readings indicating the level of activity in their room. In this example, we
sample microphones continuously for approximately 500 ms and use the number of
crossings through the average microphone sample as feature. The MicCollector
Smoblet running on a handheld device retrieves these features from all smart
objects in range, thereby relieving them from storing these data. After several
readings, the MicCollector Smoblet can then derive the location of smart objects
(i.e., whether they are in the same room) and determine what is happening in a
room. Because of the more powerful computational capabilities of handhelds, it
can thereby carry out more demanding algorithms for evaluating sensory data.
Fig. 7 shows the microphone features collected by the MicCollector Smoblet from
four devices. In the figure we have filled the area between sensor features from
smart objects in the same room. As can be seen, the features of devices in one
room are correlated, meaning that they decrease and increase simultaneously.
This fact is exploited during the context recognition process on the handheld.

Fig. 7. Small subset of the data assembled by the MicCollector Smoblet reflecting the
microphone measurements of four remote smart objects.
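
The feature described above, the number of crossings through the average microphone sample, can be sketched in a few lines. This is an illustration in Python under the assumption that one sampling window is available as a plain sequence of numbers; the code on the sensor nodes themselves is not shown in the text, and the function name `mean_crossings` is hypothetical.

```python
from typing import Sequence


def mean_crossings(samples: Sequence[float]) -> int:
    """Count how often the signal crosses its own average value
    within one sampling window."""
    if len(samples) < 2:
        return 0
    mean = sum(samples) / len(samples)
    crossings = 0
    previous_above = samples[0] > mean
    for value in samples[1:]:
        above = value > mean
        if above != previous_above:
            # The signal moved from one side of the average to the other.
            crossings += 1
        previous_above = above
    return crossings
```

A window with little acoustic activity yields few crossings, while a lively signal yields many, so a single integer per window is enough to be shared through the tuplespace instead of the raw samples.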

Providing user interfaces. The presented approach for outsourcing computations
can facilitate user interaction with smart objects, which usually do not
possess keys or displays. To show the applicability of our concepts in this area,
we have augmented a heart rate monitor belt with a BTnode that records the
electromagnetic pulses generated by the belt during a training session (cf.
Fig. 8). The BTnode also carries a Smoblet that can be downloaded to a handheld
device. Thereby, a user interface is transmitted to the mobile user device that
allows a sportsman to evaluate the last training session.

Fig. 8. (a) A BTnode with an attached heart rate sensor; (b) the user interface
downloaded from the BTnode to visualize the pulse data collected during the last
training.

8 Conclusion 

In this paper, we presented an approach that enables smart everyday objects 
to spontaneously access the capabilities of nearby mobile user devices. Thereby, 
smart objects outsource computations to nearby handhelds and hence can dy- 
namically exploit their resources. A distributed data structure facilitates the co- 
operation with multiple smart items and enables handhelds to access remotely- 
generated sensor values. We have evaluated our concepts based on a concrete 
implementation, and identified two application domains - collaborative context 
recognition and graphical user interaction with smart objects - in which they 
prove to be valuable. 

Abstract. Today, most mobile devices (e.g. PDAs) are in some way
associated with a fixed personal computer or server. In general this relation
is only taken into account for synchronization purposes.

This is rather restrictive as, while away from these fixed computers, such
mobile devices may require resources that are not available (e.g. network
bandwidth, processing power or storage space). This lack of resources
prevents the user from doing what he wants when he wants.

We propose a system in which these limitations are overcome by enabling
automatic code execution on a remote computer. At run time
it is decided whether some application code should run locally or on a
remote computer. This is achieved using runtime meta-programming and
reflection: we transform a centralized Python application so that some
part of its code is run on another computer, where the needed resource
is known to be available. This is accomplished without any manual code
change. The performance results obtained so far, i.e. with no optimizations,
are very encouraging.



1 Introduction 

With the advent of mobile communication technologies, the resources available
to mobile devices are no longer constant. At one moment connectivity
and bandwidth can be low, and in the next these resources can be widely
available. This variation in the environment, and the way the application must
react to it, poses a challenge to the development of environment-aware
applications. In addition to the mobility of these devices, we now have a large
number of personal computers with highly available resources (disk, bandwidth,
screen, ...); when the user is roaming with a mobile device, such resources are
available but not used.

One kind of application that behaves independently of the environment
resources is the download manager attached to browsers. On a portable device,
while browsing a site, if the user wants to download a file, this action is always
performed on the device. The browsing application is not aware of nearby
computers that may have a larger display or better connectivity. If these
computers could be used in a transparent and automatic way, the user would
benefit in terms of speed and ease of use. If the network bandwidth is limited or
the storage space is insufficient, a better approach would be to execute the
download on a remote computer.

For example, an application that displays or stores a file specified by its URL
can benefit from the resources available on remote computers. If the storage
space on the PDA is not enough to save the file, one should transparently save it
on another computer. The same applies to the case where network bandwidth is
reduced. If we are viewing files on a PDA, the use of a large display is preferable.
If a nearby computer has a larger display than the PDA on which we are running
our application, then the file should be presented on that remote computer.

One challenge we see in such environments is not only to adapt the mobile 
computer applications to its environment but also to use the wasted resources 
available in the surroundings. Adding to this challenge there are other prob- 
lems: reconfiguration of the running objects when the environment changes, the 
automatic selection of the remote computer and issues concerning security and 
resource usage. 

In order to accomplish this, the first problem to address is how to split an
application in order to run part of it on a different computer. The first solution
that comes to mind, and the hardest, is to explicitly program each application
to make it mobile with the device. Such a solution makes application mobility
impractical because of the many different scenarios one must address.



1.1 Shortcomings of Current Solutions 

The implementation of an environment aware distributed application can be 
accomplished using ordinary remote procedure calls or RMI libraries. The code 
to handle the environment observation and the results of these observations must 
be implemented by the programmer. The selection of the objects that are to be 
run remotely must be known at development time and this is hard coded to these 
objects. The programmer must also know in advance how the object should be 
executed in the remote computer. 

When using some sort of agent API, handling the location of the remotely
executed object becomes easier. The agent libraries provide a way to let the
programmer decide at a later time where the code should run.

These solutions have the drawback that the original code must be changed 
and that the various possible scenarios (resources to take into account, available 
remote machines, different application configurations, ...) must be addressed 
while coding the transformed application. 

By transforming the binary executable so that the resulting program can
handle the environment changes, there is no longer any need to develop a new
source version different from the original. This transformation tool must read
the source code (or a binary version) and, in accordance with some configuration
file, change the way the original code runs. Such a tool must insert the code
that handles the environment awareness and transform the way the objects are
created (remote versus local object creation).

By doing this transformation at compile time, we get two different binary
versions, but the original source code remains unchanged. This approach is better
than the explicit programming mentioned earlier but still has some drawbacks.
As the new binary versions have features and use resources that may not be
available on all platforms, there is still the need to match the correct binary
version to the platform and to the execution needs.



1.2 Proposed Solution 

When the code transformation is done at run-time, the problem that arises 
from having several binary versions disappears. Each platform has a different 
transformer that runs when executing the application. The same binary version 
has different behavior depending on the existence of a code transformer. 

The system responsible for the code transformation does not read the binary
file; instead, it intercepts the loading of the code and changes the class
representation that is then stored in main memory.

With this approach the same binary can be run using several transformation
tools, generating different final programs, without the need to administer
different versions.

By using metaclass programming we manage to intercept the class loading 
in an easy and straightforward way. We implemented a metaclass, defined it as 
the constructor of all loaded classes and made it responsible for the adaptation 
of the code being loaded. 
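
The idea of a metaclass that sees every class as it is constructed can be sketched as follows. This is a minimal illustration in present-day Python 3, not the implementation described in the text: here each class opts in through the `metaclass=` keyword, whereas the system described above installs its metaclass as the constructor of all loaded classes. The names `AdaptingMeta`, `MOBILE_CLASSES` and `is_mobile` are hypothetical.

```python
# Hypothetical set of mobile class names; in the described system this
# information would come from a configuration file.
MOBILE_CLASSES = {"Downloader"}


class AdaptingMeta(type):
    """Metaclass that is invoked for every class it constructs and
    can therefore adapt the class before the application uses it."""

    loaded = []  # names of all classes seen so far, kept for illustration

    def __new__(mcls, name, bases, namespace):
        cls = super().__new__(mcls, name, bases, namespace)
        AdaptingMeta.loaded.append(name)
        # Adaptation step: this sketch only tags the class. A real
        # transformer would rewrite or wrap its code at this point.
        cls.is_mobile = name in MOBILE_CLASSES
        return cls


class Downloader(metaclass=AdaptingMeta):
    def fetch(self, url):
        return f"fetching {url}"


class Logger(metaclass=AdaptingMeta):
    pass
```

Because `__new__` of the metaclass runs while the class statement is being executed, the adaptation happens at load time and the source of `Downloader` and `Logger` stays untouched.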

In the next section we present some technologies and systems that address
issues similar to those of our work (remote code execution, mobility and
reflection). In Sects. 3 and 4 we describe the architecture and implementation,
respectively, of our system. Finally we present a performance and functional
evaluation as well as the conclusions and future work.



2 Related Work 

A few years ago mobile computing was synonymous with agent programming. Agent
systems allowed the programmer to easily develop applications that would roam
around several remote computers to perform a certain task.

Today, with the development of wireless technologies and portable devices, 
mobile computing has a new meaning. Now the applications are mobile because 
the devices they are running on are mobile. To take advantage of the full potential 
of the resources available in the surrounding environment these applications must 
adapt the way they execute. 

The first approach to the adaptation of these mobile applications was to
adapt the data transmitted to and from remote computers. The solutions
proposed range from the development of specific proxies to the use of distributed
middleware that handles the adaptation of the data transmitted. The proxy
solution was first applied to web content [1] [2] and allowed the transformation
of the content so that its download time is reduced. At the other extreme we find
systems that allow the development of mobile applications that interact with a
data source. In the work done by T. Kunz [3] the communication is done by a
series of objects: some local to the mobile device and others in the data source
that adapt the information transmitted according to the resources available.

These solutions have the drawback that only the data is adapted to the
device’s environment. Another approach to the adaptation of the application
execution is the development of middleware platforms that are environment
aware. These systems range from generic discovery systems such as Satin [4]
or ICrafter [5] to the more specific work done by Nakajima [6] on a middleware
for graphical and multimedia applications. These solutions are valid, but they
solve the problem of constrained resources by invoking services on remote
computers rather than by executing the application’s original code on a
better-suited host.

Some systems use reflection [7] [4] [8] to accomplish the transparent adaptation
of applications, but their scope is the same as that of the systems described
previously. Only logical mobility is performed: a task is executed on a remote
host, but no application code is moved to the remote computer, so the service
performed must already be present there.

Also related to the work we try to accomplish are some migratory
application systems. The work done by Krishna Bharat [9] allowed the
development of graphical applications that could roam between several computers,
but in a monolithic way. Harter et al. [10] used a VNC server so that the
interface of an application could be moved from one display to another, but the
code remained running on the same server.

The use of the Remote Evaluation Paradigm [11] overcomes the deficiencies
of the previous systems by allowing the development of reconfigurable
applications [12] with code mobility. There is no need to program where and when
the objects should move to. The programmer must only develop the code taking into
account that it will be mobile and state which requirements the remote host
should satisfy. For instance, in Fargo [13] the programmer must develop a special
kind of component: the complet. While developing these special components,
the developer must program how the objects will roam among the computers:
environment requirements, object aggregation, ... After the development of these
components the application must be compiled so that the mobile code is inserted
in the application.

Our work also relates to projects such as JavaParty [14] or Pangaea [15], in the
sense that the distribution of the objects and the decision on where they should
run are hidden from the programmer. In JavaParty the system hides the remote
creation of objects and their mobility, but the programmer must tag these objects
with a special keyword, and a separate compiler is needed. In Pangaea, a special
compiler analyzes the source code and transforms it so that some objects are
created on remote hosts.



3 Architecture 

The system must accomplish three different tasks in order to automatically
transform a centralized application: load the application requirements, transform
the mobile classes according to the requirements, and allow the remote execution
and mobility of the code.

In our system, these tasks are executed by three different components (shaded
in Fig. 1): the Application Requirement Loader, the Code Transformer
and the Remote Execution engine.




Fig. 1. System Architecture



In order to make a class mobile, during the execution of the application a 
configuration file must be present. This file states what classes are mobile and 
what their requirements are. The Application Requirement Loader reads 
the configuration file and stores the requirement rules associated to each class. 
From the information stored in the file, this module builds the rules that state 
whether a certain class instance should run locally or remotely. This module is 
also responsible for loading the code of the probes that observe the surrounding 
environment. 

The Code Transformer module is responsible for transforming the classes
that may run locally or remotely (those referred to in the configuration file). If
the Application Requirement Loader module knows about the class being
loaded, the Code Transformer module changes the application code so that it
is possible to create remote instances of that class. Besides changing
the class code, this module also attaches to it the rule that should be evaluated
when creating objects.

The Remote Execution module runs both on the mobile device and on
the remote computer. This module is responsible for the creation of the remote
objects and the communication between the mobile device and the remote
computer. It must also upload the code to the remote computer if it
was not previously installed there. This module is contacted
during the execution of the application whenever an object should
be created remotely and whenever such a remote object should execute a method.

The Environment Aware Classes must be supplied to our system at run
time. These classes must observe the surrounding environment (network
bandwidth, display size, ...) and inform the application whether the requirements
are met. These classes must comply with a certain interface that will be
described later (Sect. 4.2).



3.1 Execution

The steps in the loading and transformation of an application are the ones shown
in Fig. 2.





Fig. 2. Application transforming 



Fig. 3. Object creation 



Even before the loading of the application code, the Application Requirement
Loader is created. If a configuration file (app.xml) exists, it is read. From
the reading of the configuration file, the rules are created, stored and associated
with the corresponding class. In order to later evaluate the environment, the code
of the Environment Aware Classes is also loaded and instantiated. These
first steps are performed by the Application Requirement Loader module
as described in Fig. 2.

After the reading of the rules, the real transformation of the application is 
performed by the Code Transformer. After a class code is read, the Code 
Transformer module checks the existence of a rule associated to that class. If 
the class objects are to be mobile, the class code is transformed and the rules 
are also attached to it. This code and rules will be responsible for the evaluation 
of the environment and the creation of the object in a remote computer. 

The code that is inserted into the transformed classes is executed when cre- 
ating new objects as shown in Fig. 3. 

If a certain class does not have an evaluation rule associated with it (i.e., it
is not mobile), the object creation code is left unchanged, so its instances are
always local.

If the class was transformed the associated rule is evaluated. This evaluation 
states if the local device has enough resources to execute the objects or if the 
object must be created on a remote computer. 

If it was decided that the object can run locally, a normal local object is 
created. Otherwise, an instance of the class is created on a remote computer and 
on the local device a proxy to this new object is created. This proxy will replace 
a local object and forward the local method calls to the remote object. 
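
The object creation path described above can be sketched with a metaclass that overrides `__call__`, the hook Python runs whenever a class is instantiated. This is only an illustration under stated assumptions: the rule is modelled as a callable that returns True when the local device has enough resources, the remote host is simulated in the same process, and the names `MobileMeta`, `RemoteProxy`, `create_remotely` and `placement_rule` are hypothetical.

```python
class RemoteProxy:
    """Stands in for a remote object and forwards method calls to it."""

    def __init__(self, remote_object):
        # In a real system this would be a network reference,
        # not an object living in the same process.
        self._remote = remote_object

    def __getattr__(self, name):
        # Only called for attributes not found on the proxy itself.
        return getattr(self._remote, name)


def create_remotely(cls, *args, **kwargs):
    """Hypothetical stand-in for the Remote Execution module."""
    # Pretend this instantiation happened on another host.
    instance = type.__call__(cls, *args, **kwargs)
    return RemoteProxy(instance)


class MobileMeta(type):
    """Evaluates the class's rule, if any, each time an instance is requested."""

    def __call__(cls, *args, **kwargs):
        rule = getattr(cls, "placement_rule", None)
        if rule is None or rule():
            # No rule attached, or local resources suffice: normal local object.
            return super().__call__(*args, **kwargs)
        return create_remotely(cls, *args, **kwargs)


class Viewer(metaclass=MobileMeta):
    # Assumed rule: pretend the local display is too small.
    placement_rule = staticmethod(lambda: False)

    def show(self, name):
        return f"showing {name}"


class Plain(metaclass=MobileMeta):
    pass
```

Calling `Viewer()` then yields a proxy whose `show` call is forwarded to the object created by `create_remotely`, while `Plain()` yields an ordinary local instance because no rule is attached to it.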

The remote objects and their proxies are created in the context of the Remote
Execution module.



4 Implementation

In the development of our system we used the Python [16] language. We took
advantage of its portability and dynamic nature and of the presence of
introspection and reflection mechanisms on all tested platforms. There are fully
functional Python interpreters for most desktop and server platforms and for most
portable devices (Windows CE, QNX and Linux). In order to transform the
application at run time we used Python’s built-in reflection mechanisms, and to
make communication between computers possible we used the Pyro [17] package.



4.1 Application Requirement Loader 

The Application Requirement Loader module reads a configuration file and,
from the information present in that file, assigns each class a rule that states
whether its instances should run locally or remotely. The file is written in XML
(Fig. 5); its DTD is given in Fig. 4.



<!ELEMENT program (name, class*)>
<!ELEMENT class   (name, host, expr)>
<!ELEMENT expr    (decis | and | or | not)>
<!ELEMENT decis   (name, config)>
<!ELEMENT and     (expr+)>
<!ELEMENT or      (expr+)>
<!ELEMENT not     (expr)>
<!ELEMENT host    (name)>
<!ELEMENT config  (#PCDATA)>
<!ELEMENT name    (#PCDATA)>

Fig. 4. Configuration file DTD



<program> <name> app1 </name>
  <class> <name> clss0 </name>
    <host> <name> host1 </name>
    </host>
    <expr> <decis>
      <name> Bandwidth </name>
      <config> <more>1000</more>
      </config>
    </decis> </expr>
  </class></program>

Fig. 5. Configuration file



The classes that have some sort of environment requirement will have a rule
in this file. These rules are written with the usual logical operators (OR, AND,
NOT). After these rules are read, a tree-like structure is generated and stored
so that they can be evaluated later.

This module stores the rules in a hash-table so that, during the loading of
the classes (executed by the Code Transformation Module), these rules can be
attached to the classes referred to in the configuration file.
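
The loading step can be sketched in present-day Python as follows. This is only an illustration under stated assumptions, not the authors' loader: the function names parse_expr and load_rules, the nested-tuple shape of the rule tree and the use of xml.etree.ElementTree are all choices made for this sketch. The keys "name" and "rule" mirror the ones read from the server entry in Fig. 7.

```python
import xml.etree.ElementTree as ET

# A configuration in the shape of Fig. 5 (element names follow the DTD of Fig. 4).
CONFIG = """
<program> <name> app1 </name>
  <class> <name> clss0 </name>
    <host> <name> host1 </name> </host>
    <expr> <decis>
      <name> Bandwidth </name>
      <config> <more>1000</more> </config>
    </decis> </expr>
  </class>
</program>
"""

def parse_expr(expr):
    """Turn an <expr> element into a nested tuple (the rule tree)."""
    node = expr[0]  # an <expr> wraps exactly one of decis / and / or / not
    if node.tag == "decis":
        # Leaf: keep the sensor name and the raw <config> element for later use.
        return ("decis", node.findtext("name").strip(), node.find("config"))
    # Inner node: recurse into the child <expr> elements.
    return (node.tag, [parse_expr(child) for child in node.findall("expr")])

def load_rules(xml_text):
    """Return the hash-table {class name: {"name": host, "rule": tree}}."""
    rules = {}
    for cls in ET.fromstring(xml_text).findall("class"):
        rules[cls.findtext("name").strip()] = {
            "name": cls.findtext("host/name").strip(),
            "rule": parse_expr(cls.find("expr")),
        }
    return rules

rem_clss = load_rules(CONFIG)
```

Keeping the raw config element in each leaf defers its interpretation to the class that knows how to read it, which matches the division of work between Sect. 4.1 and Sect. 4.2.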

4.2 Environment Aware Classes 

When evaluating the rules previously stored, there is always the need to evaluate
the surrounding environment. This is stated by the decision element (decis) that
appears in the XML configuration file. These nodes are the leaves of the rules
and are implemented by the environment aware classes.

These classes have only two requirements: the constructor must receive a piece of
XML that complies with its own definition, and there must exist a method named
decide that returns true or false according to the sensor's measurement and
the information in the XML code.

The code that interacts with the sensor that observes the surrounding environment
must be supplied in the decide method and is executed when evaluating
the rules. When the instances of these classes are created and attached to a rule,
they are passed an XML snippet that complies with a certain DTD and defines
the requirements the application has.

When evaluating the environment, this object knows exactly what the application
needs in order to be executed, and returns true or false according to whether a
certain requirement (stated in the XML code) is met.
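
A minimal sketch of such a class, and of how a rule tree built from it could be evaluated, is given below. It rests on several assumptions that are not in the paper: the probe callable stands in for the real sensor code, the "less" comparison is invented alongside the "more" element of Fig. 5, and the evaluate helper and its tuple-based tree are illustrative only.

```python
import xml.etree.ElementTree as ET

class Bandwidth:
    """Leaf of a rule: compares a measured bandwidth with the configured bound."""

    def __init__(self, config_xml, probe):
        # config_xml: the <config> snippet of the rule, e.g. <more>1000</more>.
        # probe: hypothetical callable that returns the current measurement.
        limit = ET.fromstring(config_xml)[0]
        self.comparison = limit.tag        # "more", or the assumed "less"
        self.threshold = float(limit.text)
        self.probe = probe

    def decide(self):
        """Return True if the requirement stated in the XML is met."""
        value = self.probe()
        if self.comparison == "more":
            return value > self.threshold
        if self.comparison == "less":
            return value < self.threshold
        raise ValueError("unknown comparison: " + self.comparison)

def evaluate(rule):
    """Evaluate a tree of ("and" | "or" | "not", [children]) nodes and leaves."""
    if hasattr(rule, "decide"):
        return rule.decide()               # leaf: ask the environment aware object
    operator, children = rule
    results = [evaluate(child) for child in children]
    if operator == "and":
        return all(results)
    if operator == "or":
        return any(results)
    if operator == "not":
        return not results[0]
    raise ValueError("unknown operator: " + operator)
```

Passing the measurement in as a callable keeps the sketch testable without hardware; in a real deployment the body of decide would contain the sensor access code itself.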

4.3 Code Transformation Module 

The Code Transformation Module intercepts all class code loading and decides
whether the class should run unmodified or not. This module uses the information
generated by the Decision Making Module to know if a class must be
transformed. The class loading interception is accomplished using a customized
metaclass.

The classCreator metaclass (Fig. 6) is responsible for intercepting the loading
of the program’s classes.



1  class classCreator(type):
2      def buildClass(name):
3          if (name in classCreator.remClss):
4              oldclass = type.buildClass(name+"_old")
5              replaceClass = type.buildClass(name, remoteCodeCreator)
6              replaceClass.originalClass = oldclass
7              replaceClass.server = classCreator.remClss[name]
8              return replaceClass
9          else:
10             return type.buildClass(name)

Fig. 6. classCreator metaclass pseudo-code



When building a class, this code checks if the class being built was referred
to in the configuration file (Line 3) by looking at the remClss hash-table. This
hash-table was previously built and populated while reading the configuration
file (Sect. 4.1). If its name is not present in the hash-table, this class is built
normally by the system default metaclass type (Line 10).

If the class being built was referred to in the configuration file, this metaclass
must replace it with a proxy class (Lines 4-8 in Fig. 6). The first action is to
store the original class (Line 4), so that later it can be accessed to create local
or remote objects. Next an instance of the remoteCodeCreator class is built
(Line 5) and the original code and the evaluation rules are passed to it (Lines 6
and 7). This replacement class (remoteCodeCreator) is shown in Fig. 7.

1  class remoteCodeCreator(object):
2      def buildObject(this, *args):
3          decision = this.server['rule'].decide()
4          if decision == False:
5              obj = this.originalClass.createObject()
6          else:
7              hostname = this.server['name']
8              URI = genURI(hostname)
9              proxy = getAttrProxyForURI(URI)
10             cdURI = proxy.createObject(this.originalName, args)
11             obj = getProxyForURI(cdURI)
12         return obj

Fig. 7. remoteCodeCreator replacement class



When creating an object that may run remotely, instead of running the 
original constructor, it is the code present in the remoteCodeCreator that runs. 
The rule associated with the class is evaluated (Line 3) and, if the local device
has enough resources to satisfy the rule, the result is negative and a local object
is created (Line 5). In the opposite case a remote object must be created on the
personal computer associated with this program. To create the new object in the
remote computer, a connection to the Remote Execution Module present 
in the remote computer is made (Lines 8 and 9). Next, a remote instance of 
the original class is created (Line 10) along with its proxy (Line 11). Either 
a local object or a remote object proxy is returned in Line 12. This proxy is 
transparently created by the Pyro System and implements the same interface 
as the original class. From this point forward our original application interacts 
with a remote object by calling methods from the proxy without any change in 
the original code. 
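
The interplay of Fig. 6 and Fig. 7 can be approximated in runnable Python 3 as shown below. This is a sketch under assumptions, not the authors' code: the names ClassCreator, RemoteCodeCreator, rem_clss and create_remote are adaptations chosen here, and create_remote is a hypothetical callable standing in for the Pyro calls of Lines 8-11. As in Fig. 7, a negative decision yields a local object.

```python
class RemoteCodeCreator:
    """Replacement for a transformed class: decides local vs. remote on each creation."""

    def __init__(self, original_class, server, create_remote):
        self.original_class = original_class
        self.server = server                # {"name": host, "rule": object with decide()}
        self.create_remote = create_remote  # hypothetical stand-in for the Pyro calls

    def __call__(self, *args):
        if not self.server["rule"].decide():
            return self.original_class(*args)   # negative result: local object
        # Positive result: delegate creation to the remote side, get a proxy back.
        return self.create_remote(self.server["name"], self.original_class, args)


class ClassCreator(type):
    """Metaclass that swaps configured classes for a RemoteCodeCreator."""

    rem_clss = {}          # hash-table filled from the configuration file
    create_remote = None   # injected remote factory (an assumption of this sketch)

    def __new__(mcls, name, bases, namespace):
        cls = super().__new__(mcls, name, bases, namespace)
        if name not in mcls.rem_clss:
            return cls     # not mentioned in the configuration: built normally
        return RemoteCodeCreator(cls, mcls.rem_clss[name], mcls.create_remote)
```

Because the object returned by the metaclass is callable, code such as Downloader(url) keeps working unchanged whether the result is a local instance or a proxy. The sketch does not handle class bodies that rely on zero-argument super(), which a fuller implementation would need to address.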



4.4 Remote Execution Module 

The Remote Execution module runs partly on the fixed computer where the
remote code will execute and partly on the mobile device.

On the server side, this module is composed of a service that receives requests
from the clients to create new objects. These new objects are then registered and
made available to be called remotely.

The creation of remote objects is similar to other remote method execution
systems: first a proxy to the server is obtained, and then the calls are forwarded
to the located object. When requesting the creation of a remote object (Line 10
of Fig. 7), if its code is not present in the remote computer, the Pyro server
automatically downloads the necessary code to create and execute the objects.
After the code is loaded and the object is created, this new object is registered
as a new server in order to receive the calls made in the original application.

On the original application side, after the remote object creation, the URI
returned is used to create a new Pyro proxy to the remote object. From this
point forward, calls apparently made to a local object are redirected by the
Pyro proxy to the remote object stored and available in the server.
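
The create, register and forward cycle described in this section can be illustrated without any networking. The sketch below is an in-process stand-in, not Pyro: ObjectServer, Proxy, the invoke method and the "sketch://" URI scheme are all invented for the illustration, and the automatic code download is replaced by a dictionary of already known classes.

```python
import itertools

class ObjectServer:
    """Creates objects on request and registers them under a generated URI."""

    def __init__(self, host, known_classes):
        self.host = host
        self.known_classes = known_classes   # class name -> class already on the server
        self.objects = {}                    # URI -> registered object
        self._ids = itertools.count(1)

    def create_object(self, class_name, args):
        obj = self.known_classes[class_name](*args)
        uri = "sketch://%s/obj%d" % (self.host, next(self._ids))  # made-up URI scheme
        self.objects[uri] = obj              # the object can now receive calls
        return uri

    def invoke(self, uri, method, args, kwargs):
        return getattr(self.objects[uri], method)(*args, **kwargs)


class Proxy:
    """Forwards every method call to the object registered under uri."""

    def __init__(self, server, uri):
        self._server = server
        self._uri = uri

    def __getattr__(self, method):
        # Only reached for names the proxy itself lacks, i.e. the remote methods.
        def forward(*args, **kwargs):
            return self._server.invoke(self._uri, method, args, kwargs)
        return forward
```

In the real system the invoke step crosses the network and arguments are serialized by Pyro; the calling code, however, looks the same as in this sketch.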



4.5 Execution Overview 

In order to use this system and allow the automatic remote execution of the
code, there is no need to change the original code or the Python interpreter.
Instead of calling the application the usual way (python app.py),
one only needs to prepend our system file: python codetrans.py app.py. One
can also configure Python so that our system is always loaded whenever a Python
program runs.

The codetrans.py file is responsible for bootstrapping our system by loading
the configuration file associated with the Python program and registering the
classCreator metaclass (Fig. 6) as the default metaclass. After these initial
steps the original Python file is loaded and transformed as described in Sect. 4.3,
and the application starts executing.
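
A launcher in this spirit can be sketched for current Python versions. Python 3 has no module-wide default metaclass, so this sketch swaps in a different technique: it temporarily wraps the builtins.__build_class__ hook that CPython calls for every class statement. The function name run_transformed and the transform callback are hypothetical, and the approach is an assumption of this example rather than what codetrans.py does.

```python
import builtins
import runpy

def run_transformed(script_path, transform):
    """Run script_path as __main__; each class it defines passes through transform."""
    original_build_class = builtins.__build_class__

    def build_class(func, name, *bases, **kwargs):
        cls = original_build_class(func, name, *bases, **kwargs)
        # Only touch classes defined by the launched script itself.
        if getattr(cls, "__module__", None) == "__main__":
            return transform(cls)
        return cls

    builtins.__build_class__ = build_class
    try:
        # The script is executed unmodified; its globals are returned.
        return runpy.run_path(script_path, run_name="__main__")
    finally:
        builtins.__build_class__ = original_build_class   # always undo the hook
```

A transform that looks the class name up in the rule hash-table and returns a replacement object would reproduce the behaviour of Sect. 4.3 without editing the application.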

In order to create objects in remote computers, the daemon responsible for 
it must be running on those computers. 

In this initial prototype the selected remote computer is hard-coded in the
XML configuration file. The environment aware classes must be coded so
that we can evaluate the resources available: display size, disk capacity, network
bandwidth.

5 Evaluation 

In order to evaluate the possible uses of our system, we developed two test
applications. The first reads a URL from the keyboard and downloads
that file to the hard disk. The second also reads a URL, downloads
the corresponding text file and shows it in a graphical window.

By using our system we managed to execute part of the applications (the user
interface) on a Compaq iPAQ and execute the other part on a remote computer.
In the first test the code responsible for the download of the file runs on the
remote computer. In the second test we managed to run the download code
and open a new graphical window on a remote computer. These distributed
applications were executed without any change to the original code.

In order to evaluate the overhead incurred by our system, we developed a
series of microbenchmarks that allow us to measure the time spent in each stage of
execution. The potential overheads and the applications used to measure them
are described next:

Bootstrap. In this test we measure the time to execute a simple application
that only prints a message on the display. We managed to measure the
bootstrap of our system without loading any XML configuration file.

Rule loading. With this test we measure the time to load 100 rules present in
a configuration file. The rule provided to each class is the one shown in Fig. 5.

Class loading. The test program used to evaluate the time spent in class
loading loads 100 classes that don’t have any rule associated.

Class transforming. This test is similar to the previous one, but the loaded
classes have a rule associated, allowing us to measure the time to transform
a class.

Rule evaluation. This final test loads 100 classes and creates one object of
each class. To measure the rule evaluation time, the rules are evaluated but
the objects are created locally.

These test applications were run on an Apple iBook with an 800 MHz processor
and 640 MB of memory. We used a version 2.3 Python interpreter running on
Mac OS X. The results are presented in Table 1.



Table 1. Mac OS X execution times (sec)

                      unaltered Python   adaptation system
bootstrap overhead          0.12               0.50
rule loading                0.12               0.96
class loading               0.22               0.60
class transforming          0.22               1.14
rule evaluation             0.22               1.18

From the values shown in Table 1 we conclude that our system overhead 
comes from the bootstrap (about 0.38s) and from the XML rule loading (about 
0.46s for 100 rules). All the other tasks have a minimal impact on the execution 
time. The loading of 100 classes instead of taking 0.10s takes 0.18s, when the 
classes are transformed. The evaluation of the rules takes a minimal time to 
perform (about 0.04s). 
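
The figures quoted above can be reproduced from Table 1 under one reading of the benchmarks, which is an assumption of this sketch: every test run with the adaptation system pays the bootstrap cost, the class transforming test also pays the rule loading cost, and the rule evaluation test differs from class transforming only by the evaluation itself.

```python
# Times from Table 1, in seconds: (unaltered Python, adaptation system).
TABLE_1 = {
    "bootstrap overhead": (0.12, 0.50),
    "rule loading": (0.12, 0.96),
    "class loading": (0.22, 0.60),
    "class transforming": (0.22, 1.14),
    "rule evaluation": (0.22, 1.18),
}

def total_overhead(test):
    """Extra time the adaptation system adds to one test."""
    plain, adapted = TABLE_1[test]
    return round(adapted - plain, 2)

bootstrap = total_overhead("bootstrap overhead")
# Rule loading also pays the bootstrap, so subtract it.
rule_loading = round(total_overhead("rule loading") - bootstrap, 2)
# Loading classes that have no rule adds nothing beyond the bootstrap.
class_loading = round(total_overhead("class loading") - bootstrap, 2)
# Transforming requires the rules, so subtract bootstrap and rule loading.
transforming = round(total_overhead("class transforming") - bootstrap - rule_loading, 2)
# Rule evaluation is the step from the transforming test to the evaluation test.
evaluation = round(total_overhead("rule evaluation") - total_overhead("class transforming"), 2)

# Plain loading of 100 classes: the difference between the two unaltered baselines.
plain_class_loading = round(TABLE_1["class loading"][0] - TABLE_1["bootstrap overhead"][0], 2)
transformed_class_loading = round(plain_class_loading + transforming, 2)
```

Under these assumptions the script yields 0.38 s for the bootstrap, 0.46 s for rule loading, 0.10 s versus 0.18 s for loading 100 plain versus transformed classes, and 0.04 s for rule evaluation, which agrees with the values stated in the text.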



6 Conclusions 

We developed a system that allows us to experiment with the possibility of creating a
reflective platform to dynamically adapt applications depending on the resources
available. This platform, without any application source code change, modifies 
the way an application behaves. With the inclusion of configuration files the 
programmer can make the objects of a class remote. When creating these objects, 
depending on the resources required by the objects and available at the device, 
these objects are created locally or on a remote computer. These adaptations 
are made with minimal loss in performance. 

We now have a system that allows ordinary applications to use the best 
resources available around a mobile computer. To use these resources the pro- 
grammer neither has to change the source code of the application nor handle 
multiple binary versions. 
This experiment with application adaptation is promising. Using this platform
we can now work on the development of a transparent and automatic
application reconfiguration system using mobile objects.

Abstract. In this paper we discuss research work that enables the development
of mixed societies of communicating plants and artefacts. PLANTS is an EU- 
funded Research and Development project, which aims to investigate methods 
of creating “interfaces” between artefacts and plants in order to enable people 
to form mixed, interacting (potentially co-operating) communities. Amongst 
others the project aims to develop hardware and software components that 
should enable a seamless interaction between plants and artefacts in scenarios 
ranging from domestic plant care to precision agriculture. This paper deals with 
the approach that we follow for the development of the homonymous system 
and discusses its architecture with special focus on describing the communica- 
tion among artefacts and plants and on designing an ontology that provides a 
formal definition of the domain under consideration. 



1 Introduction 

The vision of Ambient Intelligence (AmI) [1] implies that technology will become
invisible, embedded in our natural surroundings, present whenever we need it, enabled 
by simple and effortless interactions, accessed through multimodal interfaces, adaptive 
to users and context and proactively acting. In the future, it is envisaged that the 
spaces we live in will be populated by many thousands of everyday objects with the 
ability to sense and actuate in their environment, to perform localised computation, 
and to communicate, even collaborate, with each other. These objects are identified as 
artefacts and are playing a large role in research towards intelligent systems and
ubiquitous computing. The AmI environment can be considered to host several Ubiquitous
Computing (UbiComp) applications, which make use of the infrastructure provided by 
the environment and the services provided by the objects therein. 

An important characteristic of AmI environments is the merging of physical and
digital space; major research efforts are currently targeting the “disappearance” of the
computer into the fabric of our environment [2]. However, currently there are only
few discussions on including elements of the real (natural) environment into UbiComp
applications. In this paper, we present our efforts to create digital interfaces to nature, 
in particular to selected species of plants, enabling the development of synergistic and 
scalable mixed communities of communicating artefacts and plants. 

Plants are ubiquitous in the sense that they exist in almost every environment
populated by humans. They easily adapt to changing conditions and they can be used
as sources of complex information on the environment. Our approach builds upon
previous experience gained in the e-Gadgets project [3], where ubiquitous computing
applications are composed from everyday objects under the Gadgetware Architectural
Style (GAS) [4]. We have extended this concept to regard plants as real-world
“components”, which can communicate with artefacts in the digital space by using
compatible metaphors. To this end, we provide each plant with a GAS-compatible
description of its properties and state.

In order to create synergistic mixed communities of communicating artefacts and 
plants, an interdisciplinary research effort is required involving plant science domain 
knowledge, sensory hardware engineering and ubiquitous system software engineer- 
ing [5]. Research issues include the selection of the “right” sensors with respect to the 
data and sensitivity, the fusion of sensed data and its semantic characterization, the 
synchronization of the entire artefact-plants system, the integration of sensors, actua- 
tors, artefacts with decision making procedures etc. 

The rest of the paper is organized as follows. Section 2 outlines an overview of the 
PLANTS system where basic concepts and components are identified and described. 
In section 3 the architecture of the system and design decisions are discussed. The 
role and the design of the PLANTS ontology are presented in section 4. Section 5 
discusses scenarios that have been examined for showcasing the developments. Section 6
examines related work and emphasizes the different perspectives of our
work. Finally, section 7 concludes this paper.



2 Overview of the PLANTS System 

We are motivated by the fact that plants are truly ubiquitous entities, since they
exist in everyday environments. In addition, plants can be used either as “biosensors”
or as a “natural” and beautiful interface to services. In the former case, the context
provided by digital sensors or artefacts can be enhanced with data of a biological
nature, which, if fused together, can trigger a more contextually accurate response; in
the latter case, artefacts can use plants as a front end for delivering specific services
(e.g., greetings, narration, alarms, etc.). Thus, plants, if turned into ePlants, can become
part of a range of UbiComp applications, from agricultural to domestic.

A mechanism for low-level context acquisition, which reads plant signals from
sensors, is the first step in the plant context management process (Fig. 1). This
information is probably not in a format that can be used by a system in order to make
decisions or reach a conclusion. In a second step the plant signals are interpreted and
high-level context information is derived. For example, a sensor that measures the
temperature of a plant may use a scale other than Celsius. In that case the
interpretation simply maps the sensor readings to the Celsius
scale and the information can be displayed on a device if we so choose. Aggregation
of context is also possible, meaning that semantically richer information
may be derived based on the fusion of several measurements that come from different
homogeneous or heterogeneous sensors located on plants or artefacts. To determine
photo-oxidative stress, for example, requires monitoring of chlorophyll fluorescence
and ambient light level signals to adjust supplementary light. As another example,
determining water stress requires monitoring of the plant’s leaf temperature, ambient
temperature and humidity level [5].
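
The interpretation and aggregation steps can be sketched as two small functions. The choice of Fahrenheit as the sensor's native scale, the function names and the shape of the fused record are assumptions made for this illustration; the paper does not specify them.

```python
def fahrenheit_to_celsius(reading):
    """Interpretation step: map a raw Fahrenheit reading to Celsius."""
    return (reading - 32.0) * 5.0 / 9.0

def water_stress_context(leaf_f, ambient_f, humidity_pct):
    """Aggregation step: fuse three signals into one higher-level context record."""
    leaf_c = fahrenheit_to_celsius(leaf_f)
    ambient_c = fahrenheit_to_celsius(ambient_f)
    return {
        # How much warmer the leaf is than the surrounding air.
        "leaf_minus_ambient_c": leaf_c - ambient_c,
        "ambient_c": ambient_c,
        "humidity_pct": humidity_pct,
    }
```

For instance, water_stress_context(86.0, 77.0, 40.0) describes a leaf at 30 °C in air at 25 °C with 40 % humidity, a record that later rules can reason over without knowing which sensors produced it.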




Fig. 1. Plant/environmental context management process

Having acquired the necessary context, we are in a position to assess the state of the
plant and decide on an appropriate response. Adopting the definition from Artificial
Intelligence, a state is a logical proposition defined over a set of context measurements
[6]. This state assessment will be based on a set of rules that plant scientists
have researched and which the PLANTS system has to encode in a flexible and extensible
manner. The low-level (sensor) and high-level (fused) data, their interpretation and
the decision-making rules are encoded in an ontology.

The reaction may be as simple as turning on a light or sending a message to the user, or a
composite one such as requesting watering of the pot in case of drought stress or
spraying mist in case of heat stress, which means that the system has to differentiate
between the two kinds of water stress. Such a decision may be based on local context
or may require context from external sources as well, e.g., a weather station supporting
prediction of plant disease spreading.
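
The idea of a state as a logical proposition over context measurements, with a response attached to each state, can be sketched as data plus two short functions. Everything concrete here is invented for illustration: the measurement names (including soil_moisture_pct, which the paper does not mention), the thresholds and the response strings are placeholders, not rules from plant science.

```python
# Each state is a named proposition over context measurements (thresholds are invented).
STATE_RULES = {
    "drought_stress": lambda ctx: ctx["soil_moisture_pct"] < 20.0,
    "heat_stress": lambda ctx: (ctx["leaf_minus_ambient_c"] > 3.0
                                and ctx["soil_moisture_pct"] >= 20.0),
}

# Response associated with each state, following the watering / misting example.
RESPONSES = {
    "drought_stress": "request watering",
    "heat_stress": "spray mist",
}

def assess(ctx):
    """Return the names of all states whose proposition holds for ctx."""
    return [state for state, holds in STATE_RULES.items() if holds(ctx)]

def react(ctx):
    """Map every assessed state to its response."""
    return [RESPONSES[state] for state in assess(ctx)]
```

Keeping the rules in a table rather than in control flow is one way to meet the stated need for a flexible and extensible encoding, since new states can be added without touching assess or react.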



System Components 

A mixed society of plants and artefacts can be regarded as a distributed system, which
will globally manage the resources of the society, its function(s) and its interaction
with the environment. The PLANTS system is schematically represented in Fig. 2; the
UbiComp applications, which use mixed societies of ‘ePlants’ and ‘eGadgets’, are
referred to as ‘bioGadgetworlds’ and are composed of a number of basic components:

ePlants: Plants are transformed into ‘ePlants’ through the superimposition of a
technological layer and may represent either a specific plant or a set of plants (a set
may be defined in terms of specific plant species or a number of plants in a particular
location). The scope of the system enables groups of ePlants to be organized into a
large number of nodes, to create a hierarchical structure that evenly distributes the
communication load and other resource (power, memory, computation) consumption
and also facilitates distributed decision-making. These nodal groupings are considered
‘high-level’ ePlants, with the hierarchical structure adhering to the philosophy
of the software component-engineering paradigm where composite components are
synthesized from simpler ones.



Fig. 2. An example of a bioGadgetworld, the functional operation of which defines a distrib- 
uted system. 

eGadgets/Actuators: eGadgets are GAS-enabled artefacts that may represent expressive
devices (speakers, displays, mobile phones etc.), resource-providing devices
(e.g. lamps, irrigation/fertilization/shading systems) or any other everyday object. An
actuator is part of the physical interface of a tangible object and actuator systems are
mainly related to eGadget nodes. The actuator systems will allow the plant to influence
the environment that it resides in.

Sensor Systems: Sensor systems range from COTS (commercial off-the-shelf) devices
to future microsensor networks, involving off-plant (non-contact) sensors. Individual
sensor devices may be shared among a number of ePlants (e.g., due to cost
constraints), so the context of each reading needs to be determined.




Legend (Fig. 2): interaction communication; sensory communication

Decision-Making

The interaction of eGadgets and ePlants entails the triggering of local decision- 
making and global decision-making procedures. Upon determining the local state of a 
plant a decision may be required for the action to be followed. In the case of an 
eGadget (e.g., a lamp or a valve) the local decision-making (or resource management) 
mechanism resolves conflicts when multiple ePlants request a common resource (e.g., 
light or water). Distributed mechanisms can also be considered to alleviate similar 
situations, when ePlants and related eGadgets are coordinated for detect- 
ing/maintaining a global state/objective in the context of a group of distributed nodes 
(for example, is the growth of all plants in a domestic setting equally serviced? Or, is 
the whole field suffering a disease or a specific portion of it?) 

The global state regarding a group of nodes will be a reasoning function based on
the local states that are being monitored by the system. The nodes of the system have
to communicate in order to exchange data. If we consider a group of nodes, a node is
elected as ‘coordinator’ and gathers all the data that is needed in order to assess the
global state. The algorithm is fully distributed so that in the case that the coordinator
“goes off”, another node will be elected as the coordinator node. The global state may
not be as accurate as before but it will still be evaluated. When a global state is
perceived, a global decision can then be made or suggested.
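
The coordinator scheme can be illustrated with a deliberately simple model. The election rule (lowest identifier among live nodes), the Boolean local state and the "stressed fraction" used as the reasoning function are all assumptions of this sketch; the paper does not state which election algorithm or global function is used.

```python
class Node:
    def __init__(self, node_id, local_state):
        self.node_id = node_id
        self.local_state = local_state   # e.g. True if the plant is stressed
        self.alive = True

def elect_coordinator(nodes):
    """Pick the live node with the lowest id (an assumed, simplistic election rule)."""
    live = [node for node in nodes if node.alive]
    return min(live, key=lambda node: node.node_id) if live else None

def global_state(nodes):
    """The coordinator gathers the local states of reachable nodes and reasons over them."""
    coordinator = elect_coordinator(nodes)
    if coordinator is None:
        return None, None
    states = [node.local_state for node in nodes if node.alive]
    stressed_fraction = sum(states) / len(states)
    return coordinator.node_id, stressed_fraction
```

If the elected node fails, the next call elects another coordinator and still produces a global state, computed from fewer local states and therefore possibly less accurate, as the text describes.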



Sensory / Interaction Communication 

The communication in the distributed system is divided into two levels: sensory 
communication, which refers to the communication between a node and its sensor 
systems and the interaction communication, which refers to the communication be- 
tween the nodes of the system. 

We aim to separate the low-level sensory communication part from the interaction 
part. The former contains the sensors that transform chemical signals to digital and 
vice-versa and defines the lower layer of the distributed system. The latter one is used 
to conduct artefact / plant interaction which defines the higher layer of the distributed 
system. In that way, we separate the interaction services from the context of applica- 
tion (as defined by the sensor network communication). 



3 System Architecture 

A mixed society of communicating plants and artefacts forms a distributed system
whose nodes all follow the illustrated architecture (Fig. 3). We have designed
a layered modular architecture to achieve a number of objectives, which
make the system more flexible and extensible.

At each layer, we have well-defined protocols that provide access to useful services:
sensor data access, local state assessment, resource discovery, and so forth. At
each layer, APIs may also be defined whose implementations exchange protocol messages
with the appropriate service(s) to perform desired actions.

At the heart of the architecture lies the middleware software layer (ePlantOS for
ePlants / eGadgetOS for artefacts) that supports the deployment of mixed society
applications by managing logical communication channels, called synapses, between
the nodes of the distributed system. Both ePlantOS and eGadgetOS share a
common interaction module, understand basic concepts, and communicate by using
common protocols and message structures. The functionality and services of the
middleware are extended to fulfill the new requirements determined by the foreseen mixed
societies; the middleware thus provides an interface with plant sensors and actuators,
maintains an enhanced, plant-specific ontology and supports resource management and
decision-making.

The I/O unit and connectivity layers administer the communication intricacies
(e.g., commercial off-the-shelf sensor device communication protocols, routing protocols,
etc.) in terms of the sensory and interaction communication views respectively
of the system component.




Fig. 3. System node architecture 

A modular system design allows the replacement of a module without affecting the 
functionality of the rest provided that the APIs between them remain consistent. This 
principle holds for the different layers of the architecture as well as within each layer. 
The modular design of ePlantOS, for example, allows the integration of up-to-date 
algorithms and protocols in the form of plug-in modules. 

We have decoupled the low-level sensory communication that gathers raw data
from sensors from the application business logic, which we may consider to be
captured as a hierarchy of rules. In the same way we have separated the wireless
communication networking issues from the application communication requirements. 
In other words our applications do not depend on a specific sensor or a protocol that 
this sensor uses to transfer data values. In that way, new sensor devices that emerge 
for a selected plant parameter can be integrated to the system without disturbing the 
other modules. In the same manner a different wireless communication protocol can 
be used without affecting the application business logic. Thus, we achieve technol- 
ogy-independence and adaptability. 

In our approach, an application is realised through the cooperation of nodes of the
distributed system in the form of established logical communication links between
services and capabilities offered by the artefacts and the states and behaviours inferred
from the plants (in each case services/states are provided through access points
called plugs). The plug/synapse model [7] provides a conceptual abstraction that 
allows the user to describe mixed society ubiquitous applications. To achieve collec- 
tive desired functionality, one forms synapses by associating compatible plugs, thus 
composing applications using eGadgets and ePlants as components. The use of high- 
level abstractions, for expressing such associations, allows the flexible configuration 
and reconfiguration of mixed society applications with the use of appropriate editing 
tools. 

In Fig. 4 we depict a simple mixed society of an ePlant associated with an eGadget 
(eLamp), so that when the chlorophyll fluorescence signal of the plant is below a cer- 
tain level, implying a photo-oxidative stress situation, the light of the lamp must be 
turned on to a specific level of luminosity, until the chlorophyll fluorescence level 
rises up to a normal level again. A synapse has been formed between the Photo-oxidative
Stress plug of the ePlant and the Light Intensity plug of the eLamp. The
interaction module that implements the plug/synapse model is compatible between 
the two components and thus their interaction is feasible. 




Fig. 4. An example of ePlant/eGadget interaction 

In step 1, the biosensor/bioactuator network transforms selected plant (chlorophyll 
fluorescence) or other environmental (ambient light intensity) signals into digital 
signals. In step 2, ePlant’s I/O Unit reads the digital signals (sensory communication), 
which will then be interpreted to a high-level unit of information, for instance, to an 
aggregated composite signal, and this information is transferred to the middleware 
(ePlantOS). In step 3 the context received by the middleware is applied to the rules 
encoded in the ontology so that a state of the plant is determined. Then the decision 
for an action may come in the form of a command or a request for a service. The 
ontology of the ePlantOS may specify, for example, the luminosity of the requested 
light, based on the plant species at hand. In step 4, the middleware passes the information
to the connected eGadget through the established logical channel (synapse).
The connectivity and wireless communication layers implement the lower layers of 
the network stack. Finally in steps 5-7 the middleware of the eGadget receives the 
information and acts upon the eGadget by using the eGadget I/O unit that in turn 
activates an actuator through the sensor/actuator network. 
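
The ePlant/eLamp example can be condensed into a small sketch of the plug/synapse model. It collapses the I/O unit, middleware and network layers into direct method calls, and the class names, the callback mechanism and the numeric threshold and luminosity are assumptions made here for illustration only.

```python
class Plug:
    """Access point through which an eGadget or ePlant exposes a service or state."""

    def __init__(self, owner, name, on_receive=None):
        self.owner = owner
        self.name = name
        self.on_receive = on_receive   # callback run when data arrives over a synapse
        self.peers = []

    def send(self, value):
        for peer in self.peers:
            if peer.on_receive is not None:
                peer.on_receive(value)

def form_synapse(plug_a, plug_b):
    """Associate two plugs so that each can deliver data to the other."""
    plug_a.peers.append(plug_b)
    plug_b.peers.append(plug_a)

class ELamp:
    def __init__(self):
        self.luminosity = 0
        self.light_intensity = Plug("eLamp", "Light Intensity", self.set_luminosity)

    def set_luminosity(self, level):
        self.luminosity = level        # stands in for driving the actuator

class EPlant:
    def __init__(self, fluorescence_threshold, requested_luminosity):
        self.threshold = fluorescence_threshold
        self.requested = requested_luminosity
        self.stress = Plug("ePlant", "Photo-oxidative Stress")

    def on_sensor_reading(self, chlorophyll_fluorescence):
        # State assessment: fluorescence below the threshold implies stress.
        stressed = chlorophyll_fluorescence < self.threshold
        self.stress.send(self.requested if stressed else 0)
```

After form_synapse(plant.stress, lamp.light_intensity), a low fluorescence reading switches the lamp to the requested level and a normal reading switches it off again, without either component knowing the other's internals.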



4 PLANTS Ontology 

The PLANTS system is designed so that it enables the semantically meaningful inter- 
action between plants, artefacts and people via conceptualization of plants domain 
knowledge. The PLANTS ontology will represent the necessary knowledge in order 
to meet the PLANTS system’s requirements and support its functionalities. We de- 
signed the PLANTS ontology keeping in mind two key issues described in the next 
paragraph; the first one is relevant to the way that we will exploit and use this ontol- 
ogy and the second one is relevant to the knowledge that this ontology should repre- 
sent. 

The PLANTS ontology will accommodate the semantic interoperability among 
heterogeneous ePlants and eGadgets by providing to them a common language. Thus 
the PLANTS ontology’s first goal is to provide a formal representation of the domain 
under consideration, the bioGadgetWorlds (bioGWs). This representation demands 
the identification and the semantic description of the basic terms and concepts of the 
bioGWs as well as their interrelations. The basic concepts of the bioGWs are the 
following: eGadget (eGt), ePlant, Plug, Synapse, bioGW, Sensor, Actuator and Pa- 
rameter. In the PLANTS ontology these concepts are represented as different classes, 
which have a number of properties. An eGt is a GAS enabled artefact. The properties 
of eGts are divided into two categories: the physical properties like shape, material, 
etc. and the digital properties like its plugs. As an eGt/ePlant exposes its services 
through plugs, a Plug represents the eGt/ePlant’s capabilities. The connection be- 
tween two Plugs is represented by a Synapse. A set of relations between these con- 
cepts represented in the PLANTS ontology are the following: an eGt/ePlant has 
Plugs, a Synapse is formed by exactly two Plugs, a bioGW contains at least two 
eGt/ePlants and a Synapse must exist between their plugs. In Fig. 5 we present the 
basic classes defined in PLANTS ontology and the associations between them. 
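
As an illustration only, the concepts and relations listed above could be sketched in Python roughly as follows. The class and attribute names (EEntity, entity_id, add_plug, is_valid) are assumptions made for this sketch and are not taken from the PLANTS ontology itself, which is defined as an ontology rather than as program code. Reading the bioGW constraint as "at least one Synapse links plugs of two different members" is likewise an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Plug:
    """A capability (service) exposed by an eGadget or ePlant."""
    name: str
    owner: str  # identifier of the exposing eGt/ePlant (hypothetical field)


@dataclass
class EEntity:
    """Hypothetical common base for eGadgets and ePlants."""
    entity_id: str
    plugs: list = field(default_factory=list)

    def add_plug(self, name):
        plug = Plug(name=name, owner=self.entity_id)
        self.plugs.append(plug)
        return plug


class EGadget(EEntity):
    """A GAS enabled artefact."""


class EPlant(EEntity):
    """A specific plant or a group of plants."""


@dataclass
class Synapse:
    """A connection formed by exactly two Plugs."""
    source: Plug
    target: Plug


@dataclass
class BioGW:
    """A bioGadgetWorld: at least two eGt/ePlants with a Synapse between their plugs."""
    entities: list
    synapses: list

    def is_valid(self):
        ids = {e.entity_id for e in self.entities}
        if len(ids) < 2:
            return False
        # Assumed reading: some Synapse links plugs of two different members.
        return any(
            s.source.owner in ids
            and s.target.owner in ids
            and s.source.owner != s.target.owner
            for s in self.synapses
        )
```

For example, an ePlant exposing a "needs_water" plug and an eGadget exposing a "light_on" plug form a valid bioGW once a Synapse connects the two plugs.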

One of the most important stages during the design of the PLANTS ontology was
the definition of the ePlant. The digital-self of an ePlant does not differ from that of an
eGadget; thus the key issue is the representation of the physical-self of an ePlant,
which is a specific plant or a group of plants. Each ePlant has a unique eEntityId, but
the element that characterizes it is its name, which contains genus and species. Some
other properties relevant to the plants that the ePlant contains are the following: the
color and the type of its leaves, the color and shape of its flower, its developmental
stages, its possible states, its symptoms and diseases and its stresses.

Fig. 5. Basic classes defined in PLANTS ontology and their associations



Another important role of the PLANTS ontology is to support both the local and 
global decision-making process. The decision-making process will be based on a set 
of rules in operational representation forms, which will be applied on existent knowl- 
edge and allow the use of the PLANTS ontology for reasoning providing inferential 
and validation mechanisms. The reasoning will be based on the definition of the 
PLANTS ontology, which may use simple description logic or user-defined reasoning 
using first-order logic. The PLANTS ontology should represent both the necessary 
knowledge and the appropriate rules. The knowledge that the PLANTS ontology
needs in order to support the determination of the plant’s state emerges from the
characterization of a plant and is strongly connected to the values of the plant
parameters that are measured. So the definition and description of the plant parameters
are represented in the PLANTS ontology. Note that in an ePlant’s definition there is an
association between a plant parameter and at least one sensor. For the characterization
of a plant’s state and the decision-making process, the incorporation of environmental 
parameters into the PLANTS ontology will be helpful. Note that in order to have a 
complete view of a plant’s characterization it is necessary to define the threshold 
values and/or the range of values for both plant and environmental parameters for 
different plant species and the states that these values imply. 
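
A minimal sketch of how such ranges could drive the determination of a plant's state is given below. The species entry, parameter names and numeric ranges are invented for illustration and are not values from the PLANTS ontology. A real rule base would be expressed in the ontology's own rule formalism rather than in a Python dictionary.

```python
# Hypothetical "normal" ranges per species: parameter -> (low, high).
# All names and numbers here are invented for illustration.
RANGES = {
    "Ocimum basilicum": {
        "soil_moisture": (0.30, 0.70),     # volumetric fraction
        "leaf_temperature": (15.0, 30.0),  # degrees Celsius
    }
}


def classify(species, readings):
    """Return the states implied by readings that fall outside the normal range."""
    states = []
    for parameter, value in readings.items():
        low, high = RANGES[species][parameter]
        if value < low:
            states.append(f"{parameter}:low")
        elif value > high:
            states.append(f"{parameter}:high")
    # No out-of-range reading means the plant is considered to be in a normal state.
    return states or ["normal"]
```

Environmental parameters could be handled in the same way by adding further entries to the table of ranges.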

PLANTS ontology refers to setting up a concept framework on how the knowledge 
about sensors, actuators and systems available on one hand and the biological studies 
about plant stressing and sensing mechanisms and consequent plant behaviour on the 
other hand can be formalized in order to make plants an active part of our ambient 
intelligence. The decision-making process based on the sensing of plants is also
structured for the selected set of sensors and actuators and the correlated biological
information that allows the plant behaviour to be interpreted.



5 Mixed Society Application Examples

We aim to evaluate the development of the PLANTS system in a number of proof-of- 
concept scenarios that range from domestic to outdoors UbiComp applications in- 
volving plants. 

Discussions relating to the development of a scenario for the evaluation of the 
PLANTS system identified a number of categories relating to the interaction between 
plants, artefacts and people as follows: 



Crop automation              This type of scenario outlines the practical application of the
                             PLANTS technology for more sustainable agriculture and describes
                             plant signals being utilized to actuate the rate of individual
                             agricultural inputs (for example water and fertilizer).

Plant-environmental          The status of a population of plants, either a monoculture or a mixed
monitoring                   population, could be envisaged to provide feedback to people on the
                             environmental conditions of a particular area.

Plant expression to man      The devices around plants enable the communication of plant status,
                             in an abstract (i.e. changes in light, or the level of sound) or a more
                             explicit form (more detailed readings of the plant physiological
                             requirements).

Peer plant and man           This scenario category includes ideas that humanize plants. One
                             person is associated with one plant through a communication link,
                             whereby the plant status affects the person’s environment (at home or
                             by some means of wearability).



Our collaborative work aims at creating two prototypes: one related to crop
management and one related to an indoor scenario. The former will show how
technological advances can promote and increase sustainability in agriculture and
horticulture through the controlled monitoring of the plant (rather than of the physical
environment only), in order to eliminate, for example, blanket pesticide treatments or
excess water usage. The latter involves the combined presence of plants and artefacts
in an environment inhabited by humans (e.g. home, office, or public recreation
spaces) by exploiting plant signaling properties. A simple scenario, for example, is
that when a plant feels “sick” or “hungry” it can notify the people at home, at work or in
a city park through spoken messages, so that it is taken care of. In home settings the
plant itself can take actions for its well-being, e.g. turning the light on or off or opening
the window under certain conditions.



6 Related Work 

The practical issues of building a UbiComp application, called PlantCare, that takes 
care of houseplants using a sensor network and a mobile robot are investigated in [10]. 
The emphasis is on discussing technical challenges encountered during the
deployment of the application. Our approach, in contrast, emphasizes the development of
an architecture that views plants and associated computation as an integral part and
allows the interaction of plants and artefacts in the form of synergistic and scalable
mixed societies. An ontology-based conceptual model is defined for composing
UbiComp applications, which ensures balanced behavior both for ambient-nature
applications, where interactions through high-level concepts and user empowerment are
the focus, and for agricultural-nature applications, where the integration of a large
number of plant and environmental sensors and the complexity of the communication
and the decision-making processes are the focal points.

LaughingLily presented in [11] is an artificial flower that is used as an ambient 
display. We support the same concept of using plants as a beautiful “encalming” 
interface or as a metaphor to convey information in a natural and unobtrusive manner. 
At the same time, the biological aspects of plants (e.g., photosynthetic analysis) can be
equally brought forward for educational purposes or for environmental awareness. 

Attempts to use environmental sensor networks in order to improve crops are re- 
ported in [8, 9]. The approach taken is specific to crop management application 
whereas our approach strives to cover the ambient nature of applications equally. The
latter reference uses a centralized architecture to gather data, followed by an analysis
phase, so that a grower becomes able to examine crop conditions in a trial-and-error
regime. This is in contrast to our distributed management system that allows both
local and distributed decision-making approaches. 

In the biology, botany, organic computing and bioinformatics domains there are 
activities on building ontologies [12,13,14] that partially address the principles of
PLANTS. These activities aim to develop and share structured controlled vocabularies
for plant-specific knowledge domains like plant anatomy, temporal stages, genes
and biological sequences. Central to our approach is the use of an ontology that
provides not only a conceptual description of the domain knowledge but also
rules and constraints (axioms) in operational representation forms, which allow the
ontology to be used for reasoning by providing inferential and validation mechanisms.
The reasoning is based on the definition of the ontology, which may use simple de- 
scription logic or user-defined reasoning using first-order logic. 



7 Conclusions 

In this paper we discussed research work that enables the development of mixed so- 
cieties of communicating plants and artefacts. The development of PLANTS ontology 
is a central point of our research. By means of developing a basic plant ontology,
plants may become an active part of our ambient intelligence: they could become
an information source and active members of a communication process, with impact
on crop management but also on domestic settings and other places where plants and
humans can interface. Plants could provide additional sensing functionality
to be used by humans, for example as environmental quality markers. Eventually,
this will open up entirely new possibilities for assessing the quality of food plants
during growth and also during food plant logistics, for example from transportation up
to the supermarket. Our immediate plans include the development of a suitable interface
to “visualize” the ontology to the user. We believe that demonstrating decision-making
and plant monitoring processes based on such an interface to interested end-users
would have a significant impact in terms of creating practical exploitation possibilities.



Abstract. In this paper we describe the issues that need to be addressed when
setting up an aware environment occupied simultaneously by several users.
Combining the delivery of services for various users simultaneously requires
the setting up of user profiles as a record of their needs, requirements and desires.
We are thus interested in an assessment of the requirements and specifications
of user profiling. Furthermore, there is also a need for the merger of multiple
user profiles. As we are involved in the development of a smart family
home and a responsive Exhibition booth, we will investigate user profiling and 
profile management within these two contexts. We finally discuss some issues 
that we consider detrimental to the success of aware environments. 



1 Introduction 

An aware and responsive environment is one that addresses user needs, requirements 
and desires (NRDs) by delivering an experience based on a dialogue and some under- 
standing between the environment and the user(s). Such an environment is seen as 
engaging the user and reacting to his actions. 

In the case of several users in the same environment, the environment could
respond to each user individually or it could respond to users as a cohesive group. It is
interesting to note the shift of paradigm, from one of the inhabitant or visitor of the
environment to one of the user of the environment. In the context of a smart home,
not only will the users inhabit the house but they will interact with it as if it were a
system running an application. In this instance the environment's application is no
less than servicing the users and responding to their NRDs. This is indeed no small
challenge.



1.1 Applications 

We aim to develop an aware environment for two applications: one is an aware
family home and the other is a smart exhibition space. In both cases we are interested in a
responsive environment that delivers an experience that can be positively qualified by
the users. It is challenging to precisely describe what experience we would like to
deliver, especially in the context of the smart home. For the aware exhibition space, it
is much easier to do so, as we aim to develop an exhibition space that is attractive,
entertaining, informative, responsive and agreeable to the users/visitors. For the sake
of clarity we will illustrate this paper with examples from the aware family home. 




1.2 Environment Awareness 

Environment awareness for multiple users is based on (1) the environment's detection
and observation of individual users, (2) the merging and combination of multiple user
NRDs, (3) the resolution of any conflicts that might occur, whether they be of resources
(e.g. one TV set for all the family) or of interest (e.g. the father wants the children to
watch a specific program), and (4) the environment's adaptation to the users as a
cohesive group and as individuals.



1.3 User Location 

Within the environment, the user location is essential for localised and personalised 
service delivery. Knowing where the user is, at a particular time, will ensure that the 
environment is not perceived as dumb (e.g. turning on all the lights in the house sim- 
ply because someone opened the main door). The more personalised the service de- 
livery is, for example the selection of TV channels, the more refined the user location 
detection needs to be. 



1.4 Context 

The service delivery context helps refine the perception by the user of an awareness 
of the environment. Time and concurrence of events help improve user experience 
(e.g. watching TV at night implies a lower sound volume) and improve user profiles 
(e.g. always listen to radio while drinking coffee).



2 Profiles

In our opinion the environment awareness depends on the information the system has
about the users. Such information is continuously collected and used to construct user
profiles. Individual profiles are then refined and updated each time the users are
within the environment or when further data is fed to the environment. This emphasises
the importance of the environment being aware of its users. Indeed, in the context of
profiling, the environment awareness is useful for two purposes: first, to continuously
update the user profiles, which are then fed into the environment's responses to the
users' actions; second, to localise users and deliver a response that might be
personalised, localised or specialised.

User profiles have been suggested as an improvement for a variety of applications,
from query enhancement (Korfhage, 1984) and digital libraries (Amato, 1999) to the
personalisation of websites (Goel, 2002) and enhanced interpersonal communication
(Lukose, 2003). Current trends are for the integration of user profiling in the delivery
of services for an aware environment such as Family Interactive TV (Goren-Bar,
2003) or exhibitions (Kraemer, 2002).

We have run a survey of patents in the area of user profiling and have compiled the
following table.



Table 1. Review of patents

Patent #        Topic

US2004141003    Maintaining a user interest profile reflecting changing interests
                of a customer
US2004128156    Compiling user profile information from multiple sources
WO2004055745    User Profile Portability
US6757691       Predicting content choices by searching a profile database
GB2396934       Personalised profile update
US2004117357    Method, system and program product for identifying similar user
                profiles in a collection
TW566039        System for providing personalized services
WO2004052010    Recommendation of video content based on the user profile of users
                with similar viewing habits



2.1 Profiles 

Profiles are a set of characteristics and properties that describe attributes, behaviour, 
and rules of engagement. A profile is defined as a formal summary or analysis of 
data, representing distinctive features or characteristics (ref. dictionary.com). A profile
can be used to describe either a single element or a group of elements. In our current
context elements could be individuals, services, products, or systems. We are focusing on
profiles for two homogeneous sets of elements, specifically Services and Users.

2.2 Service Profiles

A service profile includes the explicit attributes and meta-data of the service being 
delivered such as theme, name, brand, model, genre etc. The options, the operational 
requirements, and the availability of the services are also part of this type of profile. 

2.3 User Profiles 

A user profile is a combination of user identity, interests and NRDs. The importance
of the NRDs decreases from the needs to the requirements and then to the
desires. Needs are the most important user specifications as they are essential (e.g.
sleeping). Requirements are necessary for a normal activity (e.g. telephone). As for
desires, they are the least important user specifications. The user profile contains user
preferences regarding the service to be offered. This will reduce or eliminate
altogether the otherwise necessary dialogue between the environment and the user to
specify some options and parameters. In the case of a simultaneous presence of several
users in the environment, a merger algorithm must be implemented that is adaptive
and reactive to the users' changing interests. A further difficulty is that such changes in
interest can be detected from either group or individual behaviours.

A user profile consists of a set of specifications, characteristics and parameters that
describe the user and his NRDs. The profile also contains information about the
user's habits, preferences and traits. Finally, the profile includes the user's role,
privileges and status.

The creation and the management of a user profile are based on three parameters:
(1) the user history and past NRDs, as provided for example by a questionnaire when
a new service is to be offered or when a service requires specific information; (2) the
general and implicit user preferences, as generated from the observation of the user's
behaviour; and finally (3) the more explicit user feedback, as provided by the user's
actions and responses to the service delivery.




Fig. 2. Human needs



2.4 Mapping of Profiles

There are three profile mapping scenarios: a one-to-one mapping (1:1), a one-to-many
mapping (1:M), and a many-to-many mapping (M:N). In the case of a (1:1) mapping
there is no need for profile merging, as there is no conflict (either of resources or
of interests). In all other cases merging is essential and remains an unresolved issue.



Table 2. Mapping of profiles and necessary mergers

                    1:M                        M:N

Service to User     User Profiles Merger       Both Profiles Mergers
User to Service     Service Profiles Merger    Both Profiles Mergers
User to User        User Profiles Merger       User Profiles Mergers



3 Merging Profiles 

Our aim is to develop a method that addresses the issue of merging multiple profiles 
and the resolution of resources and services conflicts within the context of an aware 
responsive and adaptive environment. We do not intend to develop a profile tech- 
nique or an aware environment per se. Our objectives are complementary to such 
endeavours, but we are focusing on the issues related to the processing of profiles 
when multiple users are using/present in an aware environment. 

We propose merging the profiles of the multiple users of the environment because
we want to relate each user to the environment. Such a relation can exist between
one user and the environment or between several users and the environment. In the
case of several users the relation could be concurrent or separate. The environment
responsiveness depends on the processing of user profiles and the delivery by the
environment of services and support in accordance with some rules of engagement. In
this context it would be useful to determine the common features and trends that
characterise the environment users. This would ensure a more effective and probably
more efficient operation of the environment, and if all users share the same need then
that need will be given high priority. We see a correlation between the commonality of
profile features and the importance and priority of the environment response.

There is a need to integrate the user profiles in the awareness of the environment, but
there is also a need for a merging of the users' profiles to ensure a cohesive
environment response.



3.1 Why Merging? 

The user profile merger is used by the environment either (1) to modify and influence
the environment response to the users, (2) to concurrently respond to the users, or (3)
to direct an environment request to the users.

The first case occurs when there is neither a conflict of resources nor a conflict of
interest. The second case happens when there is a conflict of resources, and
the third case is for situations when there is a conflict of interest. A conflict of resources
is typically about sharing some facilities or services between users, such as one TV in
the home. A conflict of interest is when some user(s) want to have influence over
other user(s).

We are concerned with profile merging and management in order to learn from the
environment awareness what the users' trends, traits and habits are. Such a merging
would also help highlight and rank common and individual characteristics. It is all done
with the aim of delivering an experience that at least matches, if not exceeds,
the users' expectations. Another issue is the timing of the user profiling. It could be
real-time, delayed or off-line. There is no one-size-fits-all timing: depending on the
circumstances, real-time user profiling occurs, especially when there is explicit
user input, while in other cases delayed or even off-line profiling occurs. The latter
timing is suitable, for example, in the initial setting of a service.

3.2 Merging Techniques 

The merging techniques that we will use are novel and are based on the statistical 
analysis of vector distribution in the meta-data space. Currently there are three merg- 
ing techniques that could be used: Boolean logic, Vector space model and probabilis- 
tic model (Chen, 2000) and we wish to further improve as well as combine them. 
Collaborative filtering systems have also been used (Kohrs, 2000), (Ko, 2003). 

Boolean logic is based on the merging of the profiles by similarity reinforcement of
the profile parameters and at the same time the mutual exclusion of conflicting
parameters. Boolean logic has its limitations, as the weighting of the parameters is
difficult to include. The vector space model has more potential for weighting.
Essentially, each parameter of the user profile is associated with a dimension in a vector
space. The weighting is translated into a coordinate along each dimension.
Limitations of this technique lie in the lack of scope for predictability and merging.
Finally, the probabilistic model, which relates to the assessment of the frequency of
occurrence of a parameter, has a limitation in that there is not always a correlation
between the frequency and the importance of a parameter (e.g. one of the users has
diabetes).
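
To make the contrast between the first two techniques concrete, the sketch below shows one possible reading of a Boolean merge and of the vector space representation. The function names, the parameter names and the rule that a parameter present in only one profile is kept are assumptions made for illustration, not the merging techniques proposed in this paper.

```python
def boolean_merge(a, b):
    """Reinforce parameters on which both profiles agree and exclude conflicts.

    Assumption: a parameter present in only one profile is kept unchanged.
    """
    merged = {}
    for key in a.keys() | b.keys():
        if key in a and key in b:
            if a[key] == b[key]:
                merged[key] = a[key]  # agreement: keep the shared value
            # disagreement: the conflicting parameter is excluded
        else:
            merged[key] = a.get(key, b.get(key))
    return merged


def to_vector(profile, dimensions):
    """Map a weighted profile to coordinates, one per dimension.

    A parameter absent from the profile is given the coordinate 0.0.
    """
    return [float(profile.get(d, 0.0)) for d in dimensions]
```

The Boolean merge has no place for a weight, whereas in the vector form the weight of each parameter is simply its coordinate.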

We propose to investigate a vector space model combined with a feedback
mechanism based on the comparison of predicted and actual user behaviour and
environment response to correct the vector describing the merged profiles (V_mp). In
other words, if the actual vector V_mp differs from the predicted vector V'_mp then a
correction occurs and is fed back to the environment awareness.

The value of V'_mp is predicted from the likelihood of occurrence of events,
responses and behaviours. This likelihood is evaluated by comparing, in the time
domain, discrete series of V_mp values. As a result one expects a learning curve in the
environment's awareness and the need for frequent corrections at the initial stages of
the system.
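
The feedback loop can be sketched as follows. The update rule used here, which moves the prediction a fixed fraction of the way towards the observed vector, is a simple stand-in chosen for illustration. The paper does not specify how V'_mp is predicted or corrected, and the names correct, run and rate are assumptions.

```python
def correct(predicted, actual, rate=0.5):
    """Move the predicted vector V'_mp towards the actual vector V_mp.

    Returns the updated prediction and the error that triggered the correction.
    """
    error = [a - p for p, a in zip(predicted, actual)]
    updated = [p + rate * e for p, e in zip(predicted, error)]
    return updated, error


def run(observations, initial, rate=0.5):
    """Feed a discrete time series of observed V_mp values through the loop."""
    predicted = list(initial)
    magnitudes = []
    for actual in observations:
        predicted, error = correct(predicted, actual, rate)
        # Euclidean size of the correction applied at this step.
        magnitudes.append(sum(e * e for e in error) ** 0.5)
    return predicted, magnitudes
```

With a stable observed profile the corrections shrink from step to step, which is the learning curve mentioned above.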

3.3 Selection of Metrics 

There are many metrics we could use for the calculation of the disparity between the
predicted V'_mp and the actual V_mp. We will focus on three, namely Euclidean,
Mahalanobis and Bhattacharyya. The choice of metric is made according to the level of
correlation and covariance, if any, there might be between the different dimensions of
V_mp. One of the most important issues to address is dimension reduction, in particular
the elimination of redundant or irrelevant dimensions. Reducing the dimensions of
the vectors will ensure faster processing of the data.
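
For reference, the three metrics can be written down in a few lines using their standard textbook definitions. In this sketch the Mahalanobis distance takes the inverse covariance matrix as an argument, which is an assumption made to keep the example short. The Bhattacharyya distance is computed for vectors normalised so that their non-negative components sum to one, which is one common way of applying it to profile vectors.

```python
import math


def euclidean(u, v):
    """Straight-line distance; treats all dimensions as independent and equal."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def mahalanobis(u, v, inv_cov):
    """Distance that accounts for covariance between dimensions.

    inv_cov is the inverse of the covariance matrix, given as a list of rows.
    """
    d = [a - b for a, b in zip(u, v)]
    n = len(d)
    return math.sqrt(sum(d[i] * inv_cov[i][j] * d[j]
                         for i in range(n) for j in range(n)))


def bhattacharyya(p, q):
    """Distance between two discrete distributions p and q.

    Both inputs must be non-negative and sum to one, and must overlap somewhere.
    """
    coefficient = sum(math.sqrt(a * b) for a, b in zip(p, q))
    return -math.log(coefficient)
```

With an identity covariance matrix the Mahalanobis distance reduces to the Euclidean one, which is why the choice between them depends on how strongly the dimensions of V_mp co-vary.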

4 Environment Response: Services and Experience

The user profile must include information pertinent to several domains related to the
quality of experience (QoE) and the quality of services (QoS) each user will be
expecting from the aware environment. The profiling of users we are relying on is based
on the continuous monitoring of user position, actions, and behaviour. The QoE
depends on the variety and comprehensiveness of the options available as well as the
granularity of the profiles. The QoE also depends on the solving of conflicts of
interest between users as well as between different requirements within the same user
profile. As for the QoS, it depends on the services supported by the environment and the
refresh rate of the actualisation of the environment awareness. It is also linked to the
richness of the user profiles and the effectiveness and efficiency of the profile merger.

There could be conflict between the QoE and the QoS. As the environment serv- 
ices become more comprehensive and efficient there is a risk that the environment 
becomes far too adaptive and responsive to the user. What we mean is that the envi- 
ronment could behave like the user's genie fulfilling all his wishes and desires. There 
is a need to define a domain and a protocol of responsiveness of the environment. 
Otherwise we could deliver an environment that is so finely tuned to the user that all
the random experiences that are part of the events of a normal life are removed, as if the
user were living in a sensory deprivation chamber.

Similarly the QoE needs to be sufficiently “entertaining” and “interesting” to de- 
liver an experience of the environment that avoids the “living in a box” syndrome. 
We define entertaining as being hospitable and care taking. As for interesting we un- 
derstand it as being stimulating and intellectually involving. There is a challenging 
balance to strike, as too much involvement from the user would defeat the purpose of 
an aware environment. 

The responsiveness of the environment relies on the right selection of behaviour 
cues and user instructions. Ultimately there is a design decision about who would be 
in charge of the Services and the experience. Is it the user of the environment, the 
designer of the system or the environment itself? Furthermore, our opinion is that the
environment should play different roles, ranging from a reserved housemaid, localised
and at attention, to a Gaia (goddess of earth), that is, a holistic service embedded in the
fabric of the environment. Intermediate roles would be a more intrusive butler and an
interactive agent.




Fig. 3. From Housemaid to Gaia (pictures from internet)

We must also keep this in mind as the ultimate aware environment description
borrows much from the realms of Science Fiction.



5 Open Issues 

As part of this project we are seeking the optimum rate and domain of adaptation of 
the aware environment. This would ensure we process the right information from the 
user profiles and merge them to obtain a cohesive and useful set of NRDs applicable 
to the environment. We also would like to define the level bandwidth and domain of 
control the users have over the environment. Is the user's experience similar to a dia- 
logue, a menu navigation, or a more demanding and intense interaction. How direct is 
the interaction between the user and his environment is also relevant. The directness 
of the interface has detrimental effects on the QoE, it could be based on switches, 
remote control, token, interface, speech. The duration of the interaction tasks (con- 
tinuous, discrete, contextual, triggered) will also influence the users experience. 



5.1 Conflicts 

Our understanding of conflicts is that of occurrences when there is either a limited set 
of resources available, there is a contradictory set of NRDs, or there is an overlapping 
set of NRDs. 

In the first instance the resource limitation can be addressed: the environment can
be enhanced with more of the resources in high demand. Some of the conflicts can be
easily resolved when they involve desires or, to a certain extent, requirements. The
environment can either make a decision based on the ranking of the desires and
requirements as available in the user profiles or assess the comparative ranking of the
users (e.g. the father's desires come before the children's). However, difficulties arise
when desires or requirements are contradictory or overlapping. In the first instance there
is a conflict between the users' NRDs, and it is the environment's function to avoid this
turning into a conflict between users. Two metaphors could be used: the intrusive
butler or the reserved housemaid. With the intrusive butler, the system will assess the
history of the NRDs in conflict and if, for example, there is a track record of regular
occurrence and fulfilment of one of the NRDs then it would take precedence. In the
case of a reserved housemaid configuration, the environment will suggest the
fulfilment of the various NRDs using extra resources such as a different room or, in the
case of the TV programme, recording one of the alternatives on VCR to be shown later.
As a last resort, suggesting a relocation for the users could avoid them entering into
conflict.
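
The ranking-based resolution described above can be sketched as follows. Combining the two criteria into a single ordering (NRD level first, then user rank) is an assumption made for this example, as are the names Request, user_rank and resolve. The text presents the two rankings as alternatives and does not prescribe how they are combined.

```python
from dataclasses import dataclass

# Needs outrank requirements, which outrank desires.
LEVEL_PRIORITY = {"need": 0, "requirement": 1, "desire": 2}


@dataclass
class Request:
    user: str
    user_rank: int  # hypothetical: a lower number means a higher-ranked user
    level: str      # "need", "requirement" or "desire"
    resource: str
    wish: str


def resolve(requests):
    """Pick one winning request per resource: NRD level first, then user rank."""
    winners = {}
    ordered = sorted(requests, key=lambda r: (LEVEL_PRIORITY[r.level], r.user_rank))
    for request in ordered:
        # The first request seen for a resource is the highest-priority one.
        winners.setdefault(request.resource, request)
    return winners
```

With two competing desires for the single TV set, the higher-ranked user wins; if the lower-ranked user's request is a need, it takes precedence instead.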



5.2 Semantic Aspect 

Another issue related to the profile merging is the weighting of the importance and 
significance of each user's profiles and parameters.

Fig. 4. Tensions between profiles

Within the same domain how is the weighting and classification achieved? As an
example, is the father's choice of TV programme overwhelmingly more important than
the children's one? When the domains are different, is the classification and weighting
achieved using the ranking of users? For example, is the parents' need for intimacy
higher than the children's need for bedtime stories?




Fig. 5. Tension Grading 



We call these weightings and classifications tension grades, and they are dependent
on the user's identity and role in the environment, the domain of the user parameter
and the importance the user gives to that parameter. As illustrated in figure 5, the
tension grading modifies the outcome of the profile merger. Without weighting the
merged vector is a geometric average of the combined profiles. This is not the case
with weighting, even with a weighting as simple as ranking the different profile
parameters in order of importance.
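
The effect of tension grades on the merged vector can be illustrated with a weighted mean. This sketch interprets the unweighted "geometric average" as the centroid of the profile vectors, and it applies a single grade per profile rather than one per user, domain and parameter. Both are simplifying assumptions, and the function name merge is hypothetical.

```python
def merge(profiles, grades=None):
    """Merge profile vectors as a weighted mean.

    With no grades (or equal grades) the result is the plain centroid.
    """
    if grades is None:
        grades = [1.0] * len(profiles)
    total = sum(grades)
    size = len(profiles[0])
    return [
        sum(g * p[i] for g, p in zip(grades, profiles)) / total
        for i in range(size)
    ]
```

For two opposed profiles [1.0, 0.0] and [0.0, 1.0], the unweighted merge lies halfway between them, while grading the first profile three times higher pulls the merged vector towards it.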

5.3 User Push or Environment Pull 

Deciding who is in charge of the user profiling and the profile merger is a
fundamental question for this project. In other words, is the paradigm applied one of
user's push or one of environment's pull? This also applies to the adaptation rate of the
environment as well as the service delivery. We could also argue that the interactions
between user(s) and environment are also subject to this paradigm choice.

5.4 Resonance

The proper adaptation behaviour of an adaptive system depends on the individual
user profile in a particular context of use with a particular application. If two adaptive
systems (in our case the human being and the adaptive home appliance) are coupled
with each other, the following aspects have to be taken into account: (1) the
adaptation rate ‘A’ of the environment as a system ‘s’ to the user’s behaviour (R_sA), and
(2) the two different kinds of human ‘h’ influences on the system: (a) the explicit control
rate ‘C’ based on direct user input to the system (R_hC) and (b) the implicit adaptation
rate ‘A’ by the user to the system’s output (R_hA). The main challenge of designing
such a coupling is to avoid an unintended acceleration and/or mismatch between both
subsystems based on a closed loop coupling. How should the optimal relation
[R_hC + R_hA] <-> [R_sA] be established? What is the proper balance between
[R_hC] <-> [R_hA]? Even more challenging, in the context of a multi-user environment
the adaptive system has to resolve the possible trade-off between all present user
profiles.



6 Research Strategy 

Single user profiling technology is already available; the main interest of this project 
is to extend these approaches to multiple-user profile merging. 

We have divided our project into the following phases: 

• Investigation of various frameworks and prototypes for user-profiling architec- 
tures and software. 

• Definition of an architecture for multi-user profile merging.

• Installation of different MUPEs for two different applications in a home environ- 
ment and in an exhibition space, integration and test. 

• Data collection with several groups of test subjects for validation. 

• Data analysis and extraction of significant and relevant parameters, which could 
be used for assessing the quality of the implemented MUPEs. 

• Improving and optimising one MUPE as final delivery. 

6.1 Project Outcome 

We will deliver expertise in designing a MUPE within the context of an aware home
and an exhibition booth. Prototype installations of two different MUPEs will be
available during the project; at the end, an optimised version will be operational.
Data gathering and analysis software will be developed to provide semantically
relevant input to adaptive appliances and services. Deliverables include several
working systems, reusable in other projects, composed of whatever combination and
integration of single-user profiling techniques has been found to be efficient and
effective. This project is a relevant transition and evolution from the expertise
we have built up in our research on automatic mental model evaluation (AMME), which
focuses on the analysis of behavioural data describing the interaction between
users and systems in learning situations (Rauterberg, 1993).

6.2 Industrial Relevance

Interest in this project has been expressed, with the aim of enhancing the
adaptivity of specific appliances in a smart home and a responsive environment. Two
major companies are interested in this project, and we are cooperating with them to
further define and specify the aim, scope and content of the MUPE project.



7 Conclusion 

We would like to establish expertise in the selection of relevant information about
users. One should know what to collect and how. Are the users' desires, needs and
requirements to be considered, and what is the importance of the users' behaviour?
The Multiple User mErger (MUPE) system is a combination, an overlapping and a
classification of different user profile parameters. We propose to develop a
methodology and the technologies necessary for the completion of a MUPE system.
Finally, the reliability of the proposed profiling, management and merging is an
issue we would like to address.

We hope to have clearly stated the challenges and issues related to profile merging
within the context of an aware environment.

Abstract. In this paper we propose a tangible cube as an input device for
playfully changing between different TV channels. First we consider several design
approaches and compare them. Based on a cube that has embedded gravity sensing and
wireless communication capabilities, a prototype is implemented. A 3D graphical
representation of the cube is shown on the television screen. On each face of the
cube a TV stream is rendered. The motion of the cube on the screen is connected to
the rotation the user performs using the real tangible cube. Our hypothesis is that
users can use the cube to browse between channels and to zap intuitively and
playfully, gaining an improved user experience even if the efficiency is limited
compared to a remote control. We report on initial user feedback testing our
hypothesis, in which we found that users can easily use the cube without
instructions and, despite technical limitations, see it as an improvement on
current systems. Finally we discuss the issues that emerged from the users’
feedback.



1 Introduction 

Current home entertainment products show a wide range of user interfaces and
interaction styles. In particular, remote controls offer users, if well designed,
an efficient means to operate such devices. Recent trends in mobile devices and
networking show a development towards universal remote controls based on web
technology and mobile devices, such as PDAs and mobile phones [1,2]. These
approaches can offer a very efficient way of controlling and manipulating
functions; however, considering the situations in which home entertainment is used,
efficiency is not the only goal. In [3] Bill Gaver explains the concept of Homo
Ludens and argues that design has to take it into account. Humans are described as
playful and creative explorers of their environment, and hence good design has to
address these basic needs, too, without compromising the effectiveness of a tool.

We looked at the domain of controlling a television set (TV). From informal
observations and reports we conclude that changing channels is the predominant
action users take. The goals of users who change channels are manifold:

• switching to a specific channel

• browsing the program, getting an overview of “what is on” at the moment

• zapping (looking at various channels one after another for a short time)

In the case where users browse or zap, the efficiency of selecting a specific
channel is a minor concern. Here the overall experience of using the device is at
the centre. There are further functions where efficiency plays a more dominant
role, e.g. switching the TV on and off, controlling volume, and switching views
(television and teletext). Most other functions are, to our knowledge, rarely used
in daily use.

In this paper we report on a project in which we explore a specific design option
that introduces a playful element to controlling channels on a TV. Our focus was on
providing means for changing a channel in a playful and pleasant, yet still
effective way. Other functions could be included in the interface, but this is not
discussed further in this paper. In the next section we briefly explain the idea
and how we assessed the concept by talking to users. This is followed by a detailed
description of the implementation. We then discuss our usage experience and initial
user feedback. Finally, the paper concludes with a discussion of general issues for
building tangible user interfaces for this domain.



2 Idea and Initial User Feedback 

The basic idea for a new user interface for changing TV channels resulted from
informal observations of people watching television. For many people, browsing and
zapping have by now become a common way of using these systems. Our experience from
the UK and Germany suggests that breaks for advertising and a fairly large number
of available channels are factors that reinforce this way of using a TV. When
browsing and zapping, it can be observed that people use the “next-channel” button
and step through all programs rather than selecting a specific channel. These
observations led to the idea of creating a playful interface for channel selection.

The basic concept is to use a handy cube that allows changing the channels on the
TV by physical movement in 3D space. More specifically, a virtual version of the
cube is shown on the television screen. On each face of the cube a TV stream is
rendered. The motion of the cube on the screen is connected to the rotation the
user performs using the real cube. The user can now rotate the real cube in order
to see the different sides and the corresponding TV channels. If the cube is put
down and no longer moved, the TV channel currently facing the user on the virtual
cube is enlarged to cover the full screen. As soon as the user picks the cube up
again, the currently showing channel is resized back to the facing side of the
virtual cube. Other channels are shown on the other sides of the cube.

Before implementing the system we talked to potential users to get an initial feed- 
back. In particular we were interested in the following two issues: 

• Is the idea of the interface intuitive and easy to understand? 

• Do people like the idea of a playful interface for a TV even if it reduces effi- 
ciency? 

We approached the first question by showing people a sketch with a cube on a TV
(with channel names on the sides) and a cube in a hand, and asking how they thought
this UI works and how they would explore it. See figure 1 for a sample sketch. In
these discussions with potential users we gained evidence that they could easily
understand the concept and that they would do the right things to explore the
interface.

After people understood the user interface we asked how they felt about drawbacks,
especially with regard to the efficiency of direct channel selection. For people
this seems to be a minor concern, as they generally do not find the task of
selecting the channel time-critical. One even mentioned that when zapping the idea
is to spend time, and hence playfulness is more important than efficiency.

Based on this feedback we explored options for implementing such a system and for
exploring its usefulness and usability.




Fig. 1. To give users an idea of what we would like to build we used a simple sketch that
explains the basic concept. After people got the idea we asked for initial user feedback.



3 Exploring the Design Space 

Before implementing a full prototype we explored several design options. The first 
issue was related to finding a tangible object that is the physical manipulator and has 
a visual counterpart on the screen. A cube offered the affordances that proved to be 
most interesting. It can be easily handled by people, the manipulations (picking it up, 
putting it down, rotation and translation) are intuitive and it can be placed on any 
horizontal surface. Furthermore a cube offers clear faces to place the information (TV 
channels) on. 

We explored further geometrical shapes, such as a hexagon, a cylinder, and a
sphere. The hexagon seems an interesting object, and it is also explored by Butz et
al. in [4]; however, we found the manipulations less intuitive. The hexagon also
offers an easy way of presenting information on it. The sphere and the cylinder are
very playful to use; however, putting them down on a surface is a problem (they
roll away). Additionally, the visualization of discrete information units (such as
separate TV streams) is more difficult and there is no clear natural mapping. In
figure 2 a sphere and a cube are shown next to the prototypical hardware that was
used for sensing in the initial test. The prototyping tool is based on Smart-Its
hardware [5, 6]. Similar systems have been used in other projects, e.g. [7].




Fig. 2. In the initial design phase we explored different physical manipulation devices. The
examples shown here are a sphere (left) and a cube (middle). These devices are built on
Smart-Its, a platform for prototyping ubicomp applications.



For implementing the visualisation we chose to present a 3D model of the physical
object manipulated by the user. For the representation of the TV channels we
considered several options. The low-fidelity version would present only names and
logos of the channels on the object. The high-fidelity version would continuously
render the current live TV stream on each face of the cube. An option in between is
to use screenshots of all channels as representations. Given the aim of creating a
playful experience, it appeared important to seek a high-fidelity solution, even if
this requires a number of receivers providing the TV feeds.



4 Implementation 

To further explore this type of user interface we built a fully functional
prototype of the system.



4.1 System Architecture 

The overall system consists of two main components. The cube as tangible user inter- 
face is one component and the visual representation of the cube on the screen is the 
other. This is very similar to the sketch in figure 1. To make the system functional 
further components are required: the RF-receiver linking the tangible UI to the sys- 
tem, the driver code converting the sensor information into meaningful geometrical 
information, and the video source that can be mapped to the faces of the cube. In the 
following we explain these components in more detail. 

4.2 Tangible UI and Receiver Unit

The hardware customized for the project is based on two Smart-Its that are
wirelessly linked. The communication is via short-range RF (up to 30 m, FM) using a
simple broadcast protocol. Full technical details can be found in [8].

Attached to the Smart-It included in the cube is a sensor board equipped with two
accelerometers that are mounted orthogonally to each other. Each accelerometer
includes two orthogonally aligned sensors. The accelerometers used are made by
Analog Devices (ADXL311) and offer a measuring range of +/-2 g. With this
arrangement, also depicted in figure 3, we get acceleration in three dimensions
(X-Y and X-Z). The four resulting raw data streams are constantly transmitted at a
rate of about 30 Hz per channel. The raw values are received by a second Smart-It,
which forwards the data over a standard RS-232 serial line to the computer running
the visualization. Concepts for optimizing the design of a three-axis accelerometer
are discussed in [9].

The raw values are collected by the input handling of the driver unit and passed to
the data analysing unit. As one of the original four data streams is redundant,
only three of them are taken into account. To deal with inaccuracies, several
values are recorded over a short time and smoothed by applying a running average.
The preprocessing causes a delay of about 50 ms, but improves the data
significantly.
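
The smoothing step can be sketched as follows. This is a minimal illustration of a running average over the most recent samples of one data stream, not the driver's actual code; the class name `RunningAverage` and the window length of three samples are assumptions, as the text does not state how many values are averaged.

```python
from collections import deque


class RunningAverage:
    """Running average over the last `window` samples of one stream."""

    def __init__(self, window: int = 3):
        if window < 1:
            raise ValueError("window must be at least 1")
        # deque with maxlen drops the oldest sample automatically.
        self.samples = deque(maxlen=window)

    def update(self, value: float) -> float:
        """Add a new raw value and return the smoothed value."""
        self.samples.append(value)
        return sum(self.samples) / len(self.samples)


# One smoother per data stream (x, y and z in the prototype).
smooth_x = RunningAverage(window=3)
smoothed = [smooth_x.update(v) for v in (1.0, 2.0, 3.0, 6.0)]
```

A longer window suppresses more noise but adds more delay, which is the trade-off behind the delay mentioned above.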

[Figure 3 panels: physical arrangement (left), abstracted model (right). Legend: cube case; X, Y, Z accelerometer axes; redundant X accelerometer axis.]



Fig. 3. On the left the physical arrangement of the two sensor devices inside the cube is de- 
picted. On the right the abstracted view is shown. 

Afterwards, the values for each sensor orientation are linearly converted to a
floating-point number in the interval [-1.0, 1.0]. The minimum and maximum values
are calibrated to the accelerometer’s two extreme values caused by gravity (see
Figure 3). The final resolution achieved is sensor-dependent and limited to
approximately 0.02, which corresponds to a total of 100 steps over the calibrated
range [-1.0, +1.0]. The three converted values are paired in three groups of two
elements each (X and Y, X and Z, Y and Z). For each group it is now possible to
calculate the angle between the gravity vector and one of the vectors in the group.
This is achieved by applying the arctan function to the quotients of the individual
groups:

$$\alpha = \arctan\frac{x}{y}, \qquad \beta = \arctan\frac{x}{z}, \qquad \gamma = \arctan\frac{y}{z}$$

As x, y and z are values between -1.0 and 1.0, the arctan, evaluated with the signs
of numerator and denominator taken into account, delivers angles between -180° and
180°, giving a full 360° range (see Figure 4). In theory this results in three
angles between gravity and one vector in each group.
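
The conversion and angle calculation can be sketched as follows. This is an illustration under stated assumptions rather than the prototype's driver code: the helper names `normalize` and `group_angles` are hypothetical, the raw calibration limits in the example are made up, and the two-argument `math.atan2` is used because a plain one-argument arctangent only covers -90° to 90°.

```python
import math


def normalize(raw: float, raw_min: float, raw_max: float) -> float:
    """Linearly map a raw reading to [-1.0, 1.0].

    raw_min and raw_max are the calibrated readings at -1 g and +1 g.
    """
    span = raw_max - raw_min
    if span == 0:
        raise ValueError("calibration limits must differ")
    value = 2.0 * (raw - raw_min) / span - 1.0
    # Clamp, since readings during movement may exceed the calibrated range.
    return max(-1.0, min(1.0, value))


def group_angles(x: float, y: float, z: float):
    """Return the angles (in degrees) of the XY, XZ and YZ groups."""
    alpha = math.degrees(math.atan2(x, y))  # quotient x / y
    beta = math.degrees(math.atan2(x, z))   # quotient x / z
    gamma = math.degrees(math.atan2(y, z))  # quotient y / z
    return alpha, beta, gamma


# Hypothetical raw readings with calibration limits 0 and 1024.
x = normalize(512, 0, 1024)   # 0.0
y = normalize(512, 0, 1024)   # 0.0
z = normalize(1024, 0, 1024)  # 1.0, gravity along +Z
angles = group_angles(x, y, z)
```

Note that `math.atan2(0, 0)` returns 0 instead of raising an error, so a group whose two components both vanish yields an angle of 0°; such a group carries no usable information.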




Fig. 4. If the components of the gravity vector are known the angle of the cube related to the 
gravity vector can be calculated. 

Unlike the angles related to gravity, the rotation around the world’s Y-axis (which
is perpendicular to the floor) cannot be measured by the sensors used in the
prototype and is therefore simulated by a state machine. Once calibrated, it keeps
track of the corresponding Y-rotation and can consequently decide which side of the
real cube is currently facing the user. Based on that information the current world
Y angle can be set in 90° steps. For details on the state machine algorithm see
[10]. Analysing the gravity-related values shows that, depending on the side that
is up, one angle always has a very low resolution and in the worst case flickers in
steps of up to 180°. Logically this is no problem, as only two gravity-related
angles are needed. In practice, however, it has to be decided which of the three
angles to employ in the final transformation so that the flickering angle is
avoided. Before defining an efficient selection strategy we need to explain the
observed behaviour.

The effect is due to a simple mathematical relation between the x, y and z values.
As long as the cube is not being translated, the three components together always
form the g-vector, which is, as mentioned earlier, calibrated to the absolute value
of 1.0 ($x^2 + y^2 + z^2 = 1$). Looking at $x^2 + y^2 = 1 - z^2$, the geometrical
interpretation is that all x-y value pairs, plotted in a Cartesian coordinate
system, lie on a circle with radius $\sqrt{1 - z^2}$. The closer the absolute value
of z is to 1, the smaller the radius for the value pairs of the X-Y group becomes.
As the sensor data has a limited resolution and is not optimal, this causes
problems: the resolution decreases to 0 as the absolute value of z increases to 1.
Figure 5 shows an illustration for an example resolution. Considering the angles, a
lower resolution in values causes a lower resolution in angles, and this explains
the previously mentioned flickering when the resolution is nearly 0.




[Figure 5 labels: example resolutions 80 and 24]



Fig. 5. The XY resolution decreases to 0 with z increasing to the absolute value of 1. The 
image shows an example with a decrease of resolution with a base resolution of 10 by 10. 

To find a strategy for employing the calculated angles correctly, we look at two
different resolutions during a transition. We can approximate the resolution of a
group with the current radius r by using the formula for the circumference:

$$res_{approx} = 2\pi r \cdot res_v$$

where $res_v$ is the value resolution (in our case $res_v = 100$).

Looking at the approximated resolutions of the XY and XZ groups as a function of
the z value (this corresponds to a rotation around the X-axis, see figure 3 for
illustration) gives the progression of resolution depicted in figure 6:




Fig. 6. Plotting the resolution of the XY and XZ groups as a function of the z-values.

Calculating the corresponding YZ-angle at the point of intersection delivers a
value of exactly 45°. Repeating this procedure with all different values and cube
states gives the significant transition angles of -135°, -45°, 45° and 135°. In the
example depicted we exchange the XY-angle with the XZ-angle as soon as the YZ-angle
crosses one of the given values.
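
The 45° intersection can be checked numerically with a short sketch. It assumes a pure rotation around the X-axis (so x = 0 and $y^2 + z^2 = 1$) and uses the fact that the approximated resolution of a group is proportional to the radius of its circle, so comparing radii is enough. The function name `group_radii` is hypothetical.

```python
import math


def group_radii(yz_angle_deg: float):
    """Radii of the XY and XZ circles for a rotation about the X-axis.

    With x = 0 the gravity vector is (0, sin(theta), cos(theta)),
    where theta is the YZ-angle. The XY pairs lie on a circle of
    radius sqrt(1 - z^2), the XZ pairs on one of radius sqrt(1 - y^2).
    """
    theta = math.radians(yz_angle_deg)
    y = math.sin(theta)
    z = math.cos(theta)
    r_xy = math.sqrt(max(0.0, 1.0 - z * z))
    r_xz = math.sqrt(max(0.0, 1.0 - y * y))
    return r_xy, r_xz


# Below 45 degrees the XZ group has the larger radius (better resolution),
# above 45 degrees the XY group takes over; at 45 degrees they are equal.
at_30 = group_radii(30.0)
at_45 = group_radii(45.0)
at_60 = group_radii(60.0)
```

The same equality holds at -135°, -45° and 135°, which matches the transition angles listed above.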

This results in the following algorithm for handling the validity of the three
angles. Assume that there are three groups g1, g2 and g3. Then the following two
rules apply:

I) Exactly 2 groups must always be valid.

II) If a valid group’s angle crosses -135°, -45°, 45° or 135°, switch the validity
state of the other two groups.
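
The two rules can be sketched in a few lines. This is an illustrative reading of the rules, not the driver's implementation: the function names, the dictionary representation of the validity state and the handling of the wrap-around at ±180° are assumptions.

```python
GROUPS = ("g1", "g2", "g3")
THRESHOLDS = (-135.0, -45.0, 45.0, 135.0)


def crosses_threshold(prev_deg: float, curr_deg: float) -> bool:
    """True if the step from prev_deg to curr_deg passes a threshold."""
    # Use the shortest signed step so that 179 -> -179 counts as +2 degrees.
    delta = (curr_deg - prev_deg + 180.0) % 360.0 - 180.0
    end = prev_deg + delta
    low, high = min(prev_deg, end), max(prev_deg, end)
    return any(low <= t + shift < high
               for t in THRESHOLDS
               for shift in (-360.0, 0.0, 360.0))


def update_validity(valid: dict, prev: dict, curr: dict) -> dict:
    """Apply rule II once; rule I is preserved if it held before.

    valid maps group name -> bool, with exactly two groups valid.
    prev and curr map group name -> angle in degrees.
    """
    for g in GROUPS:
        if valid[g] and crosses_threshold(prev[g], curr[g]):
            new_valid = dict(valid)
            for other in GROUPS:
                if other != g:
                    new_valid[other] = not valid[other]
            return new_valid
    return valid


state = {"g1": True, "g2": True, "g3": False}
prev_angles = {"g1": 40.0, "g2": 0.0, "g3": 0.0}
curr_angles = {"g1": 50.0, "g2": 0.0, "g3": 0.0}
state = update_validity(state, prev_angles, curr_angles)
```

Because exactly one of the two other groups is valid before the switch, toggling both simply swaps them, so rule I continues to hold after every application of rule II.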

Now there are always two valid angles to map onto the world’s X and Z rotations and
a Y angle given by the state machine. The state machine first sets up a basic
transformation matrix bringing the cube into its current state. After that, the
state machine decides how to additionally apply the currently valid angles to the
world’s X and Z rotations. This matrix, applied to an arbitrary geometry, would
transform the structure according to the real cube’s rotation state. Eventually the
final transformation matrix and the underlying raw and average arrays are offered
via the driver interface. Additionally, an event engine analyses the average
history and calculates the noise level over a range of one hundred values. By also
considering the calibrated sensor data it can decide whether the cube is lying on a
flat surface and fire the corresponding events when it enters or leaves this state.
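
A possible shape of such an event engine is sketched below. It is a simplified illustration, not the prototype's code: it looks at a single smoothed value stream, uses the population standard deviation as the noise measure, and ignores the additional check against the calibrated sensor data mentioned above. The class name `RestDetector`, the event names and the threshold value are assumptions.

```python
import statistics
from collections import deque


class RestDetector:
    """Fire events when the noise level indicates the cube is at rest."""

    def __init__(self, window: int = 100, noise_threshold: float = 0.02):
        self.history = deque(maxlen=window)     # last `window` averages
        self.noise_threshold = noise_threshold  # assumed value
        self.resting = False

    def update(self, value: float):
        """Add a value; return "rest_entered", "rest_left" or None."""
        self.history.append(value)
        if len(self.history) < self.history.maxlen:
            return None  # not enough history yet
        noise = statistics.pstdev(self.history)
        now_resting = noise < self.noise_threshold
        event = None
        if now_resting and not self.resting:
            event = "rest_entered"
        elif not now_resting and self.resting:
            event = "rest_left"
        self.resting = now_resting
        return event


# Small window for demonstration only; the text uses one hundred values.
detector = RestDetector(window=5, noise_threshold=0.02)
events = [detector.update(0.5) for _ in range(5)]
events.append(detector.update(1.0))  # sudden change: the cube was moved
```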



4.3 Screen Interface and Visualization 

The Screen Interface generally consists of three software parts and a screen
device. The input module accesses the data provided by the driver and forwards it
to the program logic. Depending on the events that occurred, the program logic
chooses either full-screen or zapping mode and additionally updates a global
transformation matrix that is later used by the graphics engine to calculate the
virtual cube. The program logic also assigns the six video sources, which are for
now just static.

A parallel thread renders the videos in the background and produces ready-to-use
textures for the final scene. To perform the final drawing we rely on the graphics
engine’s 3D capabilities. It uses the transformation matrix, updated by the data
acquisition thread, to calculate the virtual cube’s geometry. Finally the different
textures are projected onto the corresponding sides. This is rendered into a
three-dimensional environment covered by a two-dimensional descriptive layer
containing instructional information.

Additionally, each side of the virtual cube is numbered according to the real
cube’s enumeration. This is a helpful tool for recalibration, as it shows the
Y-rotation the state machine is currently assuming. A deviation between the real
cube and the virtual cube thus indicates a mis-calibration.

The graphics engine uses Microsoft’s DirectX because it supports existing hardware
accelerators as well as various types of audio and video formats. Possible video
sources are all local or streamed file types supported by Windows Media, such as
.AVI, .MPEG, .ASF or .WMV. To provide a live TV feed we have one stream available
for each channel. Depending on the graphics hardware used, the final presentation
can be output to all supported display types and resolutions.



4.4 Usage of the System 

The user interaction with the interface is quite simple: the TV is turned on and,
as usual, a channel is showing. The cube is lying next to the device and can be
picked up at any time. As soon as this happens, the currently showing channel is
scaled down, revealing the three-dimensional environment including the virtual
cube, and afterwards snaps onto the side it is assigned to. Now the screen of
Figure 7 is showing and the cube is reacting to the user’s input. As the user
rotates the real cube to a new position, new live streams are shown. In this way
she is able to see up to three videos at the same time. The user has the option to
preview the adjoining sides by navigating to the next destination. Assuming that
the user’s goal is to bring her favoured media to the front position (which is the
most visible one), the program logic considers it to be the user’s selection. That
side also does not change if the user lays the cube down; consequently the selected
media stays in focus. Based on that assumption the application enlarges the front
media to full screen as soon as the user puts the cube back on a surface. The user
can repeat this procedure as often as she wants.
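
The interaction described above amounts to a small two-mode state machine, which can be sketched as follows. The class name `ScreenLogic`, its method names and the channel labels are hypothetical, and the sketch leaves out rendering, the scaling animation and the driver events that would trigger these calls.

```python
class ScreenLogic:
    """Two-mode program logic: full screen while resting, zapping in hand."""

    def __init__(self, channels, front: int = 0):
        if len(channels) != 6:
            raise ValueError("a cube has six faces")
        self.channels = list(channels)
        self.front = front        # index of the face facing the user
        self.mode = "fullscreen"  # TV starts with a channel showing

    def on_picked_up(self):
        # Channel is scaled down onto its face of the virtual cube.
        self.mode = "zapping"

    def on_rotated(self, new_front: int):
        # Rotation only matters while the cube is in the user's hand.
        if self.mode == "zapping":
            self.front = new_front

    def on_put_down(self):
        # The front face is taken as the selection and enlarged.
        self.mode = "fullscreen"

    def showing(self) -> str:
        return self.channels[self.front]


logic = ScreenLogic(["ch1", "ch2", "ch3", "ch4", "ch5", "ch6"])
logic.on_picked_up()
logic.on_rotated(2)
logic.on_put_down()
```

After this sequence the logic is back in full-screen mode and shows the channel assigned to the face that was in front when the cube was put down.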

5 User Feedback on the Prototype and Discussion

After finalizing the prototype we asked 6 users to casually explore the system. In this 
phase we were in particular interested in potential problems that occur when using the 
system. 

One problem that quickly became obvious is the missing capability to sense the
third angle, due to the sensor system used and the fact that the sensing is related
to gravity. The experience for the user is significantly different for motions in
different directions. In two directions the reaction of the virtual counterpart is
very direct and immediate, whereas the coupling in the third dimension is very
rough (as described above). This different behaviour conflicts with the user’s
intuitive understanding of the connection between the real and the virtual cube.

However, having one dimension that does not respond immediately also introduced a
certain degree of freedom to the usage. In our sessions we recognized that users
exploit this as a feature. By relying on the system to ignore a limited Y rotation,
the user can reset a position. This is similar to lifting a mouse to readjust it
into a more comfortable place without any manipulation in the virtual space.

A further issue that we investigated was the fact that we had to decide what
information (channels) to put on the limited number of faces of the cube. There was
no final conclusion, but several options seem appropriate depending on the user and
the environment. One approach is to give the user the opportunity to freely assign
her favourite channels to the sides. Another strategy would be to randomly assign
TV channels to the sides and exchange them on the fly as soon as they are rotated
to a non-visible state. This is more related to the idea of a playful interaction.

Like the picture-in-picture feature, the cube shows more than one stream at a time.
More and more TV devices offer this feature, namely the possibility to add a little
window showing another channel in one of the corners of the full screen and to
switch quickly between the two. In this respect the cube interface offers more
options, as there are up to 3 channels on at once and a further 3 are quickly
available.

The overall user feedback was that the cube provides an intuitive and more playful
user interface to a TV, even with the limitations mentioned above.



6 Conclusion 

In this paper we presented the idea, the concept, the implementation, and initial
user feedback on a new interface for a TV. We explored different types of potential
devices that can be used as a tangible UI and decided to use a cube because of the
affordances given and the clear mapping to the visualization.

The implementation consists of several parts: a tangible UI that is wirelessly
connected to the system on which the visualization is realized. To acquire
information about the user’s manipulation of the interface, a unit sensing four
axes of acceleration is used. The acceleration values are converted and used to
move the visual representation on the screen.

Initial user feedback suggests that having user interfaces that are effective and
playful can create a good user experience. In cases such as watching TV, providing
a playful experience can be more important than pure efficiency. Currently we are
improving the sensor system based on the initial feedback and are planning a larger
user study.

Abstract. We currently witness a massive digitization of domestic materials,
e.g. photos, music, calendars, recipes, notes, messages. This digitization provides
new conditions for how we interact with materials as well as how users shape the
ambience of their homes. Observing the qualities of physical materials in the home,
the process of digitization risks losing the qualities of the spatial distribution,
the aesthetics, and common reference points offered by physical materials and
places in the home. We present a concept of domestic hypermedia (DoHM), which
exploits the potential of digital materials and at the same time allows people to
interact with digital materials in engaging ways, providing rich experiences when
organizing and using digital materials in homes. We present an infrastructure and
design concepts that offer: ambient access to digital materials, common reference
points, and collective experiences.

Keywords: Ubiquitous hypermedia, domestic technology, augmented reality, 
context awareness. 



1 Introduction 

We currently see an increased digitization of domestic material. Photos, movies,
calendars, recipes, notes, banking, messages from school etc. are increasingly
digitized and thus no longer have an inherent physical form [24]. Historically, the
workplace has undergone a similar transition; however, the home is quite different
from the workplace in many respects [3]. Activities in the home are less
task-oriented [17]; the rationalities of work in terms of production, efficiency
and organization of labor do not necessarily transfer to the home [3]. Moreover,
inhabitants continuously reconfigure and appropriate their homes, both to express
their identity to the outside world [20] and to capture their own history and
biography [26]. Thus homes have their own aesthetics, where the visible, physical
“information material” often plays a role in expressing identity, e.g. in terms of
the books being read or the music listened to by the inhabitants. These qualities
of the domestic environment are important to understand when designing hypermedia
for the home.

Thus, in order to design future domestic hypermedia systems, we argue that it is
necessary to study the characteristics of the existing use of both physical and
digital materials in homes today. While we are not the only ones to take this
stance, our focus is different from that of other studies. For example, some groups
have focused on understanding patterns of domestic routines ([3], [2]), whereas we
have focused more broadly on eliciting the different roles domestic information
materials may have, not restricted to patterns of routine. Others have investigated
what constitutes the home experience [5] more generally, but this study did not
reveal how specific materials formed part of constituting the home experience. Our
studies are inspired by those of [3][2], who investigate how the physical space of
the home is used to coordinate the handling of paper mail, but our focus is on
domestic materials beyond paper mail. The mission of this paper is to learn how to
take the best from both the digital and the physical side and combine them into
concepts for future domestic hypermedia systems.



2 Empirical Studies 

In the following, we describe the approach taken in studying domestic hypermedia
and we provide examples of the information materials that exist in private homes.
We have undertaken qualitative empirical studies in six private homes inhabited by
different types of families, ranging from singles, through families with children
living at home, to communes. We visited each household once and stayed for
approximately one and a half to two hours in each place. In each home, we asked the
inhabitants to take us on tours around the house [22]. We focused on how people
organize domestic information materials in general, i.e. media, letters, memo
lists, newspapers etc., and we interviewed people about the rationale and history
of the physical placements of both physical and digital materials. We asked about
ownership and possible conflicts around placement of materials [19]. We captured
data from the homes through video recordings and pictures. The six visits form part
of a continued investigation into domestic technology use, both our own [21], [13]
and that of others ([3], [19]).




Fig. 1. Left: Dining table with brochure from a theme park recently visited. Middle: Mixed
pile of children's books, adults' books, and unused wrapping paper awaiting further
distribution in the home. Right: Table in entrance hall with a pile of things to remember
when leaving the house.

Domestic Surfaces 

Crabtree et al. [3] were among the first to spell out in detail how specific surfaces
in the home carry particular meanings and serve to coordinate domestic life. Their
study focused on paper mail and described how, for instance, a letter found by the
porch suggests that no one else has dealt with it yet. Confirming the studies of
Crabtree et al., but focusing on domestic information materials more broadly, our
studies also suggest that the spatial layout of the home is indeed used to coordinate
and structure domestic information materials.

Figure 1 illustrates how different placements of physical materials in the house have
different meanings and how the spatial layout serves to structure domestic information
material, e.g. the material on the table by the front door is placed there to remind
the user to bring it when leaving the home. To illustrate the range of different
surfaces and their characteristics, Table 1 provides a set of examples from the homes
visited.



Table 1. Examples of different surfaces and their characteristics 



Surface          Surface characteristics
---------------  ---------------------------------------------------------------
Notice boards    Information which is ‘one click away’
                 Contact information, notifications, memorable material
Table            Traces of earlier activities
                 Piles awaiting further distribution in the home
                 Mixed media activity spaces
Entrance hall    Things to remember to bring along when leaving the home
Shelves          Media collections. Not always sorted optimally for searching,
                 but rather in terms of aesthetics and of expressing identity.
Frames           Staging of personal memory



We also see how interests other than supporting browsing and searching come into
play when placing materials in homes. For example, as illustrated in Fig. 2, only
selected cookbooks are placed so that they are immediately visible to visitors, whereas
the rest of the collection is kept in a place where visitors do not normally come.




Fig. 2. Left: Shelves in kitchen where selected cookbooks are on display. Right: Shelves in
home office, holding the other part of the cookbook collection.



This is an example of what Pallasmaa describes as follows: “Home is a staging of
personal memory. Personal space expresses the personality to the outside world, but
equally important, it strengthens the dweller's self-image and concretizes his world
order” ([20], p. 6). As homes become meaningful in this way, it is important to ensure
that this is also supported in future domestic environments where domestic materials
are digitized. The highly spatial distribution of physical materials onto a multitude
of surfaces in homes stands in marked contrast to what happens when domestic materials
become digitized and stored on a personal computer in the home. The PC essentially
provides centralized and individualized access to digital materials. It lacks the
spatial distribution, persistence and visibility of physical materials on display in
the home, of which we see numerous examples in our empirical material. Thus, when
designing domestic hypermedia systems, we need to learn from the qualities of physical
material that enable spatial information structuring in the home.

Lazy Structuring 

A further lesson from studying physical materials is how they allow unfinished and
“lazy” structures. This can be seen from the widespread use of piles as a structuring
mechanism. As illustrated in Fig. 3, all the homes we visited had piles, for instance
piles of bank papers and official papers waiting to be filed in binders.




Fig. 3. Piles as structuring mechanism in homes 

Piles are not as unstructured as they may first seem, thanks to the meaning of the
different surfaces of the home, as discussed above. Some are piles of heterogeneous
material, implicitly ordered by date, as sedimentations of materials. This is not an
effective structuring mechanism for fast search, but it is a mechanism which, with the
least possible effort, still puts some structure on the material, largely due to the
specific location of the pile. The pile in the entrance hall in Fig. 1 is not
sediment. It consists of carefully selected items which must be brought along when
leaving the home. Thus the specific context of the pile provides an additional layer
of meaning to the structure. This dimension is lost if most domestic material can only
be distributed on one digital desktop.

The Structuring Experience 

While one way to be pragmatic with respect to everyday life is to support lazy
structuring mechanisms, an alternative and complementary approach is to design for
playful and engaging structuring experiences, thus inspiring people to take time to
impose structure on their materials. Given the nature of the home, it would be
interesting to investigate how collective and more playful structuring experiences can
be supported. We address this topic in our design concepts, presented in the following.



3 Domestic Hypermedia - Folding Hyperspaces into the Home

The empirical studies point to many different ways of organizing physical material
in the home. However, collaborative spatial organization and persistent visual
awareness of the material in specific places seem to be most at risk in the process of
digitization of domestic materials. In this section, we propose a domestic hypermedia
infrastructure that supports these approaches to structuring digital material. We
propose a hypermedia concept where digital hyperspaces are folded into the physical
space. The notion of folding is inspired by the work on folding within architecture.
In this tradition, folding is a means to create more dynamic and open buildings by
folding rooms of one type in between rooms of another type. As described by Lynn [14],
folding employs neither agitation nor evisceration but a supple layering, in such a
way that the characteristics of the individual layers are maintained: “A folded
mixture is neither homogeneous, like whipped cream, nor fragmented like chopped
nuts, but smooth and heterogeneous” (ibid., p. 9).



We propose domestic hypermedia as the folding of digital spaces with physical
spaces in a way where the characteristics of both spaces are preserved, yet they are
connected through layering. Thus we envision a multitude of digitally enabled surfaces
in the future home. Each surface resides in a particular physical context as well as
in the digital environment. With current technology, the different “surfaces” may be
TVs, projections, computer displays and mobile phones, as well as HiFi equipment with
only an auditory appearance. In the future, we envision surfaces in more persistently
visible materials like Gyricon [9] and eInk [6] paper, which retain their display even
when the power is turned off. In addition, we may see a variety of ambient displays
and controls [29] like wallpaper, bottles, RFID readers etc.

Domestic Hypermedia Infrastructure 

The DoHM domestic hypermedia infrastructure (see Fig. 5) is built on top of the open
context-aware hypermedia framework HyCon [1]. HyCon has a generic sensor layer that
enables hypermedia clients to sense the context of users and devices registered in the
framework. Moreover, HyCon supports XLink structures, RDF, annotations, and WebDAV
servers for collaborative handling of content and structures. Among other things, it
supports hypermedia composites expressed in XLink.



Fig. 4. Folding physical and digital spatiality. [Figure omitted; layer labels: Physical, Digital.]

DoHM is in particular designed to support folding of hyperspaces into the physical
space. The empirical studies show that spatial organization and piling of material are
common structuring mechanisms in homes, and that lazy or incomplete structuring is
observed in many places. Thus DoHM supports spatial hypermedia structuring that
integrates the connected physical surfaces as first-class composite structures. The
DoHM infrastructure also supports carrying digital material around both inside and
outside the home by means of a physical token (like a cell phone or a smartcard)
representing a person’s digital portfolio on the home server and other integrated
servers.



Fig. 5. DoHM - Domestic Hypermedia infrastructure. [Figure omitted; labels recoverable
from the diagram: “Surface”, Apps, DoHM registrar, Client, MediaTray, PortFolio,
MediaGate, Service, Place, Notification, Portfolio, Collage, Storage, PlaceRep DB,
Home server (WebDAV), HyCon server, Web server.]

The DoHM infrastructure is thus capable of handling the common structures that
are used in homes, e.g. composites representing playlists, photo albums, and
cookbooks. Metadata associated with physical objects or with digital material like
photos is represented in RDF. A central user interface for DoHM is the client, which
provides a 2D spatial hypermedia interface including the registered surfaces as
objects, as illustrated in Fig. 5.

Modeling Home Surfaces 

The various surfaces in the private home need to be addressable from the point of
view of the hypermedia system, such that particular material can be presented in
different or even multiple physical locations in the home. As discussed in section 3.2,
surfaces in the private home serve many different purposes and exist in different
contexts. For example, things to remember when leaving the home are placed
persistently and visibly in the hallway. Peripheral awareness of earlier activities is
created through the traces of material left on the dining table.

This is implemented in DoHM by adding two central concepts to the HyCon
framework [1] in order to handle places and presentation. PlaceReps represent the
place or context of a particular domestic hypermedia surface in the home, and
PresStyles represent the style specification for a given object or surface.

PlaceReps and PresStyles 

PlaceReps may represent either a specific place like “the refrigerator door” or an
abstract place like “entrance”, which may cover several physical locations in the home,
including the hall, basement and terrace. In this way, information that must be
recalled when leaving the home can be displayed at all the physical entrances.
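
The mapping from an abstract place to several physical surfaces can be sketched as
follows. This is a minimal illustration in Python and not the DoHM implementation; the
class name, the field `surface_ids`, the function `resolve_surfaces` and the surface
identifiers are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class PlaceRep:
    """Hypothetical stand-in for a DoHM PlaceRep: a named place that
    covers one or more physical surfaces in the home."""
    name: str
    surface_ids: list[str] = field(default_factory=list)


def resolve_surfaces(place: PlaceRep) -> list[str]:
    """Return every physical surface on which material sent to this
    place should appear."""
    return list(place.surface_ids)


# A specific place covers a single surface ...
fridge = PlaceRep("the refrigerator door", ["kitchen-fridge-display"])
# ... while an abstract place covers several physical locations.
entrance = PlaceRep("entrance", ["hall-display", "basement-display", "terrace-display"])

for surface in resolve_surfaces(entrance):
    print("show reminder on", surface)
```

Under these assumptions, a reminder addressed to “entrance” reaches all three entrance
surfaces without the user naming them individually.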

In light of the nature of the different materials and surfaces (Table 1), some material
on some surfaces should be able to call for attention in some contexts, e.g. an
appointment with the dentist, whereas other material needs to be discreet and aging,
e.g. pictures from a recent weekend trip with friends. Thus we wish to support users in
specifying different presentation styles for material and surfaces in the home. We
introduce the PresStyle class to represent the specific presentation style for a
material object or a surface object. It holds an id, a name, a presentation
specification, a priority and an optional script. Both surfaces and material objects
may be associated with a PresStyle; this introduces the need for a combination strategy
similar to the issues of Dexter Pspecs [7] and the cascade of Cascading Style Sheets [4].

We wish to give users maximum freedom in presenting digital material, to support
the continuous appropriation of the home [26]. For instance, a teenager may prefer
to associate every surface in her room with a skin PresStyle that reminds her of a
favorite movie. On the other hand, parents may want to associate an intrusive
red-colored notification PresStyle with a reminder note that they place on every
relevant surface in the home. In this case there should be a combination strategy that
allows the teenager both to maintain her surface skin and to be notified about the
important note from the parents. This is solved by a calculation of the
cascade based on the cascade priority attributes of the involved PresStyles.
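
One possible reading of this priority-based cascade is sketched below in Python. The
fields of `PresStyle` follow the description above; everything else is an assumption
made for illustration, in particular the representation of the presentation
specification as a dictionary of properties and the rule that, for each property, the
style with the higher priority wins.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PresStyle:
    """Fields follow the description in the text; the dict-based
    specification is an assumption of this sketch."""
    id: str
    name: str
    spec: dict[str, str]
    priority: int
    script: Optional[str] = None


def cascade(styles: list[PresStyle]) -> dict[str, str]:
    """Combine styles property by property. On a conflict, the style
    with the higher priority wins (assumed rule)."""
    combined: dict[str, str] = {}
    for style in sorted(styles, key=lambda s: s.priority):
        combined.update(style.spec)  # higher priorities are applied last
    return combined


# The teenager's skin and the parents' notification from the example above.
skin = PresStyle("s1", "movie skin",
                 {"background": "movie-poster", "border": "none"}, priority=1)
alert = PresStyle("s2", "parent notification", {"border": "red"}, priority=10)

print(cascade([alert, skin]))
```

With these values the combined style keeps the movie background and takes the red
border from the notification, so both intentions survive on the same surface.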

Based on the empirical studies we have identified a number of relevant PresStyle
specifications to be supported. Examples of supported PresStyles are: Persistent,
Aging, Collage, Emphasize, Surprise, Animated, Conditional Appearance, and
Context Sensing. Most of these styles apply both to objects and to the surface per se,
but Emphasize, for instance, is only applicable to single objects since its semantics
is to highlight an individual object. The first four PresStyles follow directly from
the way people organize themselves with physical materials in homes. The latter four
explore the new possibilities opened up by having dynamic, context-aware digital
surfaces in homes. The PresStyles are subject to further experimentation and
evaluation in an iterative design process.



4 Appliances Utilizing the DoHM Infrastructure 

In this section we describe some design concepts that provide new structuring
experiences in homes as well as new means of shaping the ambience of homes. All are
built upon the DoHM system.

MediaGates for Piles of Unsorted Incoming Material 

DoHM provides support for uploading digital material and scanning physical
material to the home server. The gate hardware will typically be scanners (for paper),
FlashRAM or USB ports (for pictures and documents), and FireWire ports (for video).
Email and SMS/MMS messages may also be transferred to the gate space. The material
being uploaded is dropped onto a “gate” folder and canvas acting as a temporary space
for unsorted incoming material. This gate canvas can be associated with places in the
home where the family wishes to create awareness of recent unprocessed material.
One such place may be the dining table, which in homes is often used as a gate area
for physical material in transition to other places in the home (see e.g. Fig. 1). In
this way we may support the lazy structuring witnessed in homes, as seen in Fig. 3.
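
The gate behaves like a digital pile: material can be dropped without any filing
decision and sorted later, if at all. The following Python sketch illustrates that
behavior under stated assumptions; `Material`, `MediaGate`, `drop` and `file` are
hypothetical names and do not describe the actual DoHM interfaces.

```python
from dataclasses import dataclass, field


@dataclass
class Material:
    title: str
    source: str  # e.g. "scanner", "usb", "email" (illustrative labels)


@dataclass
class MediaGate:
    """Hypothetical gate: a temporary pile of unsorted incoming material
    that can be shown at one or more places in the home."""
    places: list[str] = field(default_factory=list)
    pile: list[Material] = field(default_factory=list)

    def drop(self, item: Material) -> None:
        # The newest material goes on top; nothing has to be decided yet.
        self.pile.insert(0, item)

    def file(self, item: Material, folder: list[Material]) -> None:
        # Sorting is optional and can happen at any later time.
        self.pile.remove(item)
        folder.append(item)


gate = MediaGate(places=["dining table"])
gate.drop(Material("bank statement", "scanner"))
gate.drop(Material("holiday photos", "usb"))

bank_folder: list[Material] = []
gate.file(gate.pile[1], bank_folder)  # the older item sits below the newer one
```

As with a physical pile, the order of the pile carries the only structure until
someone chooses to file an item.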

MediaTable: Augmented Collaborative Dining Table 

In many families, the dining table is a central place for coordinating activities, ex- 
changing messages, and for organizing incoming materials [3]. In the studies we have 
seen how families leave physical material on the dining table or the common room 
table creating a shared awareness in the home, before it is put away on a shelf or 
pinned to a notice board. When physical material becomes digitized it is often up- 
loaded to or received on a computer somewhere in the home by one family member 
without reaching the attention of the rest of the family. 

In a future home with many digital materials, we wish to be able to support
collective coordination and organization of materials similar to the physical case. We
are thus designing the DoHM client to support, for example, a dining table augmented
with top projection and direct tangible interaction. The DoHM client presents itself
to the user as a spatial hypermedia interface implemented in SVG, thus allowing
PlaceReps, interaction elements and material objects to be turned around so that they
are visible from arbitrary positions around the table (see Fig. 6). A common room
table may become a digital surface providing touch-based interaction. The default view
is the scratch area with incoming material and visual icons for the registered
PlaceRep objects.

In this way, family members become aware of new materials like incoming messages,
new photos, new MP3s etc. Family members passing by can, for instance, drag a
piece of material to a specific PlaceRep icon to make the material visible on one or
more surfaces in the home, shown with the default PresStyle for these surfaces.
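
The effect of such a drag operation can be sketched as a small function that turns
one dropped item into one render request per surface, each using that surface's
default PresStyle. This is an illustrative Python sketch only; `Surface`,
`push_to_place` and the surface identifiers are hypothetical, while the style names
are taken from the list of PresStyles above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Surface:
    id: str
    default_style: str  # name of the surface's default PresStyle


def push_to_place(material: str, place_surfaces: list[Surface]) -> list[tuple[str, str, str]]:
    """Return one (surface id, material, style) render request per surface
    covered by the PlaceRep the material was dropped on."""
    return [(s.id, material, s.default_style) for s in place_surfaces]


# Surfaces covered by a hypothetical "living room" PlaceRep.
living_room = [Surface("tv", "Persistent"), Surface("media-wall", "Collage")]
requests = push_to_place("beach.jpg", living_room)

for surface_id, item, style in requests:
    print(f"{surface_id}: show {item} using {style}")
```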




Fig. 6. Left: A scenario with two people using the MediaTable pushing pictures to the TV
“surface” being watched from the sofa. Right: The DoHM client supporting a collaborative
picture sorting scenario. Icons on the left are PlaceRep icons. Floating toolbars support
object interaction.

In a situation where the family gathers in the living room, the members may sit
down or stand around the table and sort the piles of recent pictures into folders on
the home server, and drag some of them to specific PlaceReps to have them rendered on
specific surfaces. They may also associate specific digital material with physical
objects by means of RFID tags as described in [7]. The objects may be printouts of
pictures or souvenirs like a concert ticket or a child's doll, which then become
physical link anchors for a picture or a collection of pictures. Thus the table also
aims to support collective organization of material, enabling new types of structuring
experiences as suggested in section 3.4.

MediaWall: Calm Awareness Support 

In order to exploit the potential of surfaces in the home, we have developed the
concept of the MediaWall. The MediaWall offers the quality of persistently visible
messages in specific positions, which is preferred in many cases. Maintaining
awareness of important material or critical events is supported in the DoHM system
through discreet or calm [28] awareness based on a discreet visual appearance. We are
inspired by the notion of informative art, such as the work by PLAY [25], as well as
by InfoCanvas [16]. The idea is to provide awareness of important material through an
artistic collage of the material on one or more surfaces. To support this we take
advantage of the open hypermedia techniques for anchoring links in arbitrary Web
pages. Here we let family members make selections on arbitrary Web pages, including
those on the home server. They can then submit the corresponding anchors to a collage
service together with a PlaceRep and a PresStyle. The surface identified by the
PlaceRep will then display the selected text and graphics corresponding to the anchor
according to some schema, with an expression designed to fit the actual room and the
inhabitants’ preferred level of calmness.

The collage service is meant to maintain an ongoing awareness of the updated
view of the information identified by the submitted anchors. An active crawler is
needed to regularly collect information from servers, filter out the parts of the
pages needed, and make an artistic transformation before the representation is
dispatched to the surfaces pointed out by the PlaceReps that were associated with the
anchor when it was submitted to the collage service.
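
One crawl cycle of such a service (collect, filter, transform, dispatch) can be
sketched as below. This is a hedged Python illustration rather than the DoHM collage
service: the `Anchor` model based on character offsets, the `refresh` function and the
example URL are assumptions, and fetching, transformation and dispatch are passed in
as functions so that the sketch runs without a network.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Anchor:
    url: str
    start: int   # character offsets of the selection (assumed anchor model)
    end: int
    place: str   # name of the PlaceRep submitted with the anchor
    style: str   # name of the PresStyle submitted with the anchor


def refresh(anchors: list[Anchor],
            fetch: Callable[[str], str],
            transform: Callable[[str, str], str],
            dispatch: Callable[[str, str], None]) -> None:
    """One crawl cycle: fetch each page, keep only the anchored part,
    transform it, and send the result to the anchor's place."""
    for a in anchors:
        page = fetch(a.url)
        selection = page[a.start:a.end]
        dispatch(a.place, transform(selection, a.style))


# Stubs standing in for the Web, the artistic transformation and the surfaces.
pages = {"http://home.example/menu": "Dinner tonight: pasta"}
shown: list[tuple[str, str]] = []

refresh([Anchor("http://home.example/menu", 0, 14, "kitchen wall", "Collage")],
        fetch=pages.__getitem__,
        transform=lambda text, style: f"[{style}] {text}",
        dispatch=lambda place, item: shown.append((place, item)))
```

Calling `refresh` on a timer would give the regular update described above; how often
to crawl is left open here, as it is in the text.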

The MediaGate, the MediaTable, and the MediaWall together present a vision of a
domestic hypermedia system that takes into account the specific challenges of the
household. The MediaGate addresses the challenge of lazy structuring, the MediaTable
offers new structuring experiences, and the MediaWall takes advantage of the meaning
and richness of the surfaces of the home. We currently have a first implementation of
the key aspects of the three appliances. These will be subject to experiments and
further evaluations in real-life homes before a next version is developed.



5 Related Work 

While others have conducted research into future interactive home environments, our
work differs in various ways from previous work. Compared to the studies by Crabtree
[2], we have covered a broader range of material and media than just physical mail.
Compared to related design concepts, we focus on supporting the handling of materials
among people who actually live in the same home, rather than supporting awareness
between people living in different physical locations, which has been investigated by
other projects ([18], [12]). Moreover, we focus on handling digital materials more
broadly than photos [27], and we challenge the position that experiences of handling
digital photos in homes can be limited to searching, wandering and recommending
[27]. In contrast, we suggest that digital photos may be important material in shaping
the ambience of homes, provided that interaction mechanisms are supported that
respect the qualities of homes. Our first suggestion is in the form of collective and
playful experiences around the MediaTable, which supports collective handling of
materials, in opposition to the prevalent, more individualized concepts [27].

Compared to the Jigsaw domestic component system [11], we have taken a
material-centered approach. We have focused on the organization of domestic material
and on how we can provide a seamless folding between the physical and digital material
spaces. In achieving that, we introduce a novel combination of open hypermedia,
spatial hypermedia, physical hypermedia, and context-awareness in the DoHM
infrastructure. Where Humble et al. [11] focus on supporting transformations between
digital and physical material, we focus on linking and integrating the physical and
the digital. We do this in terms of the home environment per se, by developing the
DoHM system so that it directly addresses surfaces in the home and lets users address
arbitrary surfaces from every DoHM client running in the home.

LiMe [23] is a Philips project which, among other things, develops a CafeTable and
Public Screen concept with access to digital material and the ability to relate it to
RFID tags. Compared to LiMe, DoHM focuses on the collaborative interaction with
home materials, and the distributed management of materials on surfaces in the home.

Spatial hypermedia systems ([8], [15]) have been a source of inspiration, and we
have extended spatial hypermedia with abstract representations (PlaceReps) of physical
surfaces and places. A PlaceRep is associated with the composite which holds the
collection of material to be presented according to the default behavior (specified by
PresStyles). Thus selected composites in the DoHM system are continually connected
to a physical place or surface.

In [8], we discuss outdoor geo-spatial hypermedia, and we make a distinction
between metaphorical and literal spatiality. The DoHM system combines support for both.
The PlaceReps are literal in the sense that they represent an actual physical place or
surface in the home. But it is a deliberate choice not to have an exact 2D or 3D model
of the house to deal with, since that is far too complex for an action such as putting
a picture on a specific notice board. Thus we view PlaceReps as slightly more
metaphorical, in the sense that we can deal with e.g. “entrance” and have that cover a
list of physical locations in the home. This supports inhabitants in developing a set
of PlaceReps that makes sense to them. In the architectural folding terminology, we
support the inhabitants in tailoring their own folding of the digital environment into
the set of installed digital surfaces in the physical environment.



6 Conclusions 

In this paper we have presented studies of the use of primarily physical material in
private homes. We have illustrated how taking the context of the home seriously
implies certain challenges for domestic hypermedia. Designing for the pragmatics of
real domestic life, rather than for some kind of idealized vision of human activities
and domestic environments, implies for instance supporting lazy structuring of
domestic materials. It implies using the rich set of surfaces of the home as a
resource in design, and it implies challenging the setting of the individualized
personal computer and exploring how collective and playful structuring mechanisms can
be developed. Based on the challenges revealed in these studies, we have described the
design of a new domestic hypermedia infrastructure called DoHM, which supports the
folding of spatial and navigational hypermedia into the physical environment of a
home. We have presented a couple of novel home appliances that take advantage of the
DoHM infrastructure. We have established the DoHM infrastructure and are initiating
experiments with users, introducing the appliances being designed. We have developed
prototypes of the appliances and the DoHM infrastructure, and they will be evaluated
both in our lab and in specific home settings within the coming months.



Abstract. This paper describes an extension to Ullmer and Ishii’s TUI
categorization [41]. The reason for adding new categories is their omission of
associative TUIs and our own work in this area with personal objects.
The benefit of using personal objects instead of generic objects is that users
already have mental models or personal links between experiences, the related media
and these objects. In addition, a Graspable or Tangible User Interface with personal
objects can support existing media systems, instead of requiring new ones that have
to be learned by users.



1 Introduction 

Over the last couple of years, it has been demonstrated that graspable, or tangible,
user interfaces (TUIs) are a promising alternative to the ‘traditional’ omnipresent
graphical user interface (GUI) (see [41] for an overview). TUIs integrate physical
representation and mechanisms for interactive control into graspable user interface
artefacts. This physical embodiment provides the user interface designer with new
and powerful means to free user-system interaction from the confines of the desktop
and merge it with the physical reality of everyday life.

The ambient intelligence paradigm of computing envisions a world in which elec- 
tronically enriched environments have become sensitive and responsive to people and 
where electronic devices have disappeared into the background of everyday life. 
Such scenarios imply the need for new interaction styles that better match the skills 
humans use to interact with their natural environment. 

Combining the need for new interaction styles that seamlessly merge into the
physical world of everyday life and the promising possibilities TUIs offer in this
respect, tangible approaches to user-system interaction seem an obvious choice to study
in the context of ambient intelligence. Ullmer and Ishii [41] have proposed a
conceptual framework that classifies many past and present systems according to a
number of tangible interface characteristics. In recent years, this framework has
become well-known and has served as a starting point for identifying and discussing
the merits and demerits of TUIs. However, in our own research on tangible user
interfaces for ambient intelligent environments ([18], [19], [20], [21]), we
encountered difficulties when we tried to classify the ‘tangibles’ that were developed
to support the interaction between people and their intelligent environments. Consider
the following example:

You are on holiday and you visit a local market where you find a piece of art
you like. You take it home and put it in your living room, and each time you
walk by you reminisce about the great time you had there. It is as if your
memories come to life whenever you see this souvenir. And not only the souvenir
creates this effect, but also your grandfather’s chair that you inherited and the
vase you got as a birthday present. It seems as if objects make it possible to
re-experience past events.

In this example, personal objects or souvenirs are automatically linked to memories
or recollections, and it is the owner’s imagination that relives the experience.
Another option is to facilitate this experience by making a digital association with
memory cues, for example digital photos. For a long time, people have been using
self-made photos to re-experience their holidays, alone or with relatives and friends.
Recently, digital cameras were introduced, and these are now rapidly replacing
analogue ones. Since digital photos are virtual and difficult to grasp for some
people, it might help if they could be associated with a physical object, such as the
art souvenir in the example above.

In our work on photo browsing in an ambient intelligent home environment, we 
explored the use of personal objects as the tangible part of the user interfaces for a 
Digital Photo Browser ([18], [19], [20], [21]). In trying to classify these personal ob- 
jects using the framework of Ullmer and Ishii [41] we were not successful. It became 
apparent that the Ullmer and Ishii framework needed to be extended to accommodate 
the important class of personal objects in a satisfactory way. 

In this paper we briefly discuss the Digital Photo Browser case study to illustrate 
the potential of using personal objects for interacting with ambient intelligent envi- 
ronments. Next, we review existing proposals for the classification of graspable user 
interfaces, including the Ullmer and Ishii framework. After proposing an extension to 
this framework, we conclude the paper with a general discussion and some conclu- 
sions. 



2 Case Study 

This case study focused on autobiographical recollections, i.e. on supporting in-home
reminiscing. As these recollections are often associated with photos and artefacts, it
was decided to build a Digital Photo Browser with graspable objects, in the shape of
souvenirs, as physical carriers for virtual data. (The recently published Living
Memory Box Project [36] is similar in this respect.) Another reason for using
artefacts as Graspable User Interfaces was a study [19] showing that souvenirs can
serve as memory cues to their owners. Besides, the link between souvenirs in a
Graspable UI and (digital) photos seems useful, since souvenirs are often bought on
a holiday, a moment in time and geographical location where the buyer of the souvenir
also takes digital photos. This automatically establishes a mental link between the
two types of media. At a later moment in time these digital photos can easily be
linked to the personal souvenir. The user can do this with the Digital Photo Browser.



2.1 Digital Photo Browser

In short, the Digital Photo Browser is a portable touch-screen device (both a Fujitsu
Stylistic LT and a Philips DeXscape are used), which is connected via an 11 Mb/s
wireless LAN link to a “fixed” PC. This PC runs various components, such as services,
agents and general applications, which communicate using a service-discovery
architecture (for detailed specifications, see [8]). In a real-life situation this
fixed PC is in the attic or study, but to the user all functionality appears to be in
the portable device. In general this Digital Photo Browser can be used for storing,
browsing and viewing digital photos. Adding new photos to the database is possible by
means of an attached camera, by scanning or by uploading from other storage media.

Figure 1a shows a sketch of the Graphical UI of the Digital Photo Browser, which
consists of three areas: 1 - an area on the left which shows a moving photo roll, 2 - a
central area which allows enlarging individual photos, both landscape and portrait, 3 -
an area on the right where icons of the current user (3a), other display devices (3b) or
detected graspable objects (3c) can be shown. In this context only area 3c is relevant,
since it will show any graspable object detected by the system.




Fig. 1. (a) A sketch of the Photo-Browser user interface (for an explanation see text), and (b) 
the Chameleon Table and some tagged souvenirs. 



2.2 Personal Souvenirs 

How are these souvenirs used together with the Digital Photo Browser? First, they are
tagged with RFID tags [33]. A special table (called the Chameleon Table) was
constructed to detect those RFID tags and to make clear to the user where and when
tagged objects are recognized (see Fig. 1b). Once an RFID-tagged souvenir is
recognized by the table, auditory feedback is provided and a thumbnail of the souvenir
appears in area 3c on the GUI. Simultaneously, the content of area 1, which shows the
moving photo roll, changes to the photos linked to the souvenir. New photos can be
associated with the souvenir by dragging thumbnails or enlarged photos from anywhere on
the screen of the Digital Photo Browser to the icon of the souvenir, e.g., from photo
albums displayed in the photo roll.

For this prototype the table detects only one object at a time. This was decided
not because of technical limitations but because detecting several objects at once would
require an object language to explain which logical operators are used. This would
make the object-device interaction less straightforward.
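
The interaction described in this section can be sketched in a few lines of Python. This is an illustrative model only, not the prototype's implementation: the class name `SouvenirTable`, its methods and the tag and photo identifiers are all hypothetical, and the auditory and visual feedback is reduced to a return value.

```python
class SouvenirTable:
    """Toy model of the Chameleon Table plus photo browser (all names hypothetical)."""

    def __init__(self):
        self.links = {}      # RFID tag id -> list of photo ids linked to that souvenir
        self.current = None  # the single tag currently recognized, if any

    def place(self, tag_id):
        """A tagged souvenir is put on the table; returns the photos for the photo roll."""
        if self.current is not None:
            # Mirrors the design decision to handle one object at a time.
            raise RuntimeError("remove the current souvenir first")
        self.current = tag_id
        return list(self.links.get(tag_id, []))

    def remove(self):
        """The souvenir is taken off the table."""
        self.current = None

    def drag_photo_to_icon(self, photo_id):
        """Dragging a photo onto the souvenir icon (area 3c) links it to the souvenir."""
        if self.current is None:
            raise RuntimeError("no souvenir on the table")
        photos = self.links.setdefault(self.current, [])
        if photo_id not in photos:
            photos.append(photo_id)


table = SouvenirTable()
table.place("tag-eiffel")                  # nothing linked yet, so the roll is empty
table.drag_photo_to_icon("paris_001.jpg")  # associate a photo with the souvenir
table.remove()
roll = table.place("tag-eiffel")           # the roll now holds the linked photo
```

Supporting several objects at once would require `place` to combine the photo sets of all detected tags, which is exactly where the logical operators of an object language would have to be defined.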

For more details on this case study, the working prototypes (see Fig. 2) or evalua- 
tions, see [19]. 




Fig. 2. The Digital Photo Browser, the Chameleon Table and some graspable objects. 



3 Graspable User Interface Categories 

Ullmer and Ishii [41] published their “emerging frameworks for tangible user
interfaces”, giving a complete overview of different aspects of TUIs, amongst others
describing an interaction model, application domains and four categories
of TUI instances. Those four categories can be divided into two groups: one
group in which physical objects are used independently of each other, and a second
group in which sets of physical objects together create an object language.
The independent-object group is called “associative”, representing physical objects
which are individually associated with digital information (e.g., each sticker
represents a URL, such as in Ljungstrand et al., 2000), such as the souvenirs described in
Section 2. The second group contains physical objects that rely upon other physical
objects to create added value or meaning, such as spatial interpretations (e.g., a 2D
layout such as in BuildIt [12]), constructive ones (e.g., for building 3D physical models,
such as LEGO-like Blocks [4]) or relational ones (e.g., creating temporary relations
between different physical objects, such as annotating videos with blocks [7]).

The physical objects which form the graspable part of the user-interface are termed 
“iconic”, by Ullmer and Ishii [41], when they share representational characteristics 
with their digital associations, or “symbolic” when those physical objects do not 
physically represent some property of the digital information.

Other relevant definitions related to Graspable/Tangible UI come from Holmquist,
Redstrom and Ljungstrand [17]. They came up with the names “containers”, “tools”,
“tokens” and “faucets” for four types of physical objects. Containers are generic
physical objects that can be associated with any type of digital information; their
physical shape gives no clues about the associations made with the digital information,
and they are primarily used to move information between devices or platforms.
Tools also do not show a relationship between the physical information and the digital
information; they are physical objects that can manipulate digital information, which
means they often represent computational functions. Tokens, on the other hand, do
reflect with their physical appearance the digital information associated with them, and they
are used for accessing stored information. Faucets are devices that can present the
digital information associated with tokens. Apparently, the authors see “reading the 
information from the tokens” and “presenting the information from the tokens” as 
something done in the same device, whereas the work in Section 2 shows it can be 
two different devices (respectively a table and a display), as well. Holmquist, Red- 
strom and Ljungstrand [17] also used the term “overloading”, by which they mean 
that one token might be associated with more than one piece of information. This 
overload of information might require the token to be location or context sensitive, 
showing particular pieces at particular locations or particular contexts, or a user might 
be able to access several pieces of information at the same time by means of a faucet. 

According to Ullmer and Ishii [41] their term iconic is similar to Holmquist et al.’s
[17] token and symbolic is similar to container. However, the meaning of the terms iconic
and symbolic is limited to the physical representation of the associated digital
information, whereas the terms token and container are also defined as giving
information on the function of the objects (respectively: accessing stored information
and moving information between devices and platforms).

Dourish [9] describes a categorization of “meaning-carrying” of Tangible User In- 
terfaces. He starts by subdividing the objects into iconic and symbolic and the mean- 
ing of the objects is identified as either related to other physical objects or to actions. 
This can get complicated. Imagine, for example, the souvenirs of section 2.2. These 
souvenirs are objects, which link to memories mentally (perhaps those memories 
contain both objects and actions), but virtually those souvenirs link to digital photos 
that can be displayed by performing an action with the object. Dourish calls this “a 
blend of properties” (p. 168), which has potential but should be worked out in more 
detail. 

A recent paper by Ullmer et al. [44] mentions token+constraint interfaces, where 
the constraints are regions that map tokens to digital information. An example of such 
a token+constraint system is the Chameleon Table together with souvenirs (see Sec- 
tion 2.3), although it is different from Ullmer et al.’s description since a token is used 
to associate and manipulate the constraints. In this example a token associates, the 
constraints are manipulated by another physical object (the Digital Photo Browser) 
and the table sets the constraints by its color instead of movement or physical location 
of the tokens.

4 Graspable User Interface Category Extension

Although the division into “object-language” categories is extremely useful [41], it
misses out some of the other dimensions of Graspable UIs that are particularly
important for the associative-TUIs group. And since Ullmer and Ishii left out this
category in a later publication [42] because they were “less confident of the utility of this
category” [41], an extension to the categorization is proposed in this paper which
covers the spatial, constructive and relational TUIs as well as the associative TUIs
introduced in [41].

The extension of the TUI-instances or Graspable UI categories is based on the 
idea that users of personal Graspable UI objects have an existing mental model of the 
links between their personal physical objects and the associated digital information. 
A mental model in this paper stands for a link between objects and media that is not 
determined by the object’s physical properties, but by past events known to the user in 
which these objects played a role, such as buying a souvenir in a far-away country or 
leaving usage traces on a piece of furniture during that party you also have photos of. 
Later those physical properties might remind the user of the links. This definition, for 
example, would exclude a physical object that looks like a book which is assumed by 
a user to have stories attached to it. After some experience with this book the user 
does have a mental model of the object and its associated media, but these relation- 
ships were not present from the beginning, they were learned. 

Examples of studies in this area include the souvenirs mentioned in Section 2, but
also Rosebud [13] [14], POEMs [39] and Passage [37]. The use of these Graspable UIs
is more suitable for novice than for expert users, owing to the negligible
learning time needed to create an internal association with an external object or a mental
model. Therefore, the category extension starts with a subdivision of physical objects
in “physical objects which have personal meaning to the user” (where the user is 
probably also the owner) and “physical objects that do not have personal meaning to 
the user” (where the user is “only” a user). This distinction immediately shows that 
the first group in particular seems very suitable for the home environment, since this 
is the place where most people keep their personal belongings. The second group is 
more suitable for expert users, since they are more willing to learn the relationships 
between physical objects and their digital associations. Therefore this group seems 
more useful in the office environment. Physical objects in the first group are mostly 
used by one person whereas objects in the second group can be used by a group of 
people. 

The second subdivision is made based on the concept of “dynamic binding” [41], 
which means that digital associations can be created and thus changed or deleted by 
the user. One group of Tangible UIs does not support dynamic binding; they are 
termed “fixed” associations, while the other group can have “flexible” associations. It 
turns out that the examples of the Tangible UIs with fixed associations always have 
only one association, but that the flexible group often supports overloading (see Table 
1 for examples). Both of these groups can be subdivided into “random” or symbolic
associations, for which the physical properties of
the object do not represent the digital properties in any way, and “meaningful” or
iconic associations, for which there is a relationship between the physical and the digital
information.

All physical objects with a fixed association appear to fall under the tool category
as defined by Holmquist et al. [17]. Therefore, this group can be subdivided into
symbolic tools (e.g., music coming out of a bottle when it is opened in musicBottles) and iconic
tools (e.g., a glass lens-shaped object, which functions as a lens for beams of light in
Illuminating Light [45]).

Fitzmaurice [10] came up with defining properties of Graspable User Interfaces, of 
which the following property seems to be most relevant for this paper: 

Both input and output of such an interface should be space-multiplexed instead 
of time-multiplexed. Space-multiplexed indicates that every function (on- 
screen) has a physical device associated with it. On the other hand, the time- 
multiplexed PC-mouse is constantly reassigned to a new function (strong- 
specific versus weak-general devices). 

According to Fitzmaurice [10] physical objects that support more than one function 
are time-multiplexed, which makes them fall outside the definition of a Graspable UI, 
which should be space-multiplexed. Perhaps it is interesting to note that space- 
multiplexed means that each physical object can only have one function, but it can 
contain more media files at the same time. Take, for example, the Rosebud system 
[13] [14], where each stuffed toy can contain one or more stories told by the owner, 
but it still has this one function: retelling stories. 

Although Holmquist et al.’s [17] token was meant to be “iconic”, based on its
function, which is accessing stored information, a new subdivision can be made,
namely symbolic versus iconic tokens, because examples of both types of tokens
exist. For example, in the metaDESK system blocks are designed to look like miniature
buildings because they represent them [40], and because these objects show the link
between physical appearance and digital information they are iconic tokens. Symbolic
tokens reflect the associated digital information only to their current users, e.g.,
attaching a TV-show to a pen [35]. Holmquist et al. [17] used the term container for both
“symbolic associations” and for supporting overloading. This does not fit the
distinction made in this paper that flexible associations can support one or more
associations and thus do not always support overloading; therefore the term container is not
taken forward in this categorization.

In Table 1 all this information can be found, together with some examples of 
Graspable/Tangible UIs. In this table the different categories by Ullmer and Ishii [41] 
can also be placed, namely: constructive and relational TUIs belong in the box ge- 
neric symbolic tools. The spatial TUIs belong in the box generic iconic tools and ge- 
neric symbolic tokens, and the associative TUIs belong in the four boxes with flexible 
digital associations. 
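
The three dimensions of the extension can be read as a small decision procedure. The Python sketch below is only an illustration of that reading; the class `GraspableObject`, its field names and the function `category` are hypothetical and do not come from the cited work. The example objects are the ones used in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GraspableObject:
    name: str
    personal: bool  # True if the user already has a mental model of the object
    flexible: bool  # True if associations can be created, changed or deleted
    iconic: bool    # True if the physical form reflects the digital information


def category(obj):
    """Name the box of the categorization that an object falls into."""
    owner = "personal" if obj.personal else "generic"
    form = "iconic" if obj.iconic else "symbolic"
    # Fixed associations correspond to tools, flexible ones to tokens.
    role = "token" if obj.flexible else "tool"
    return f"{owner} {form} {role}"


souvenir = GraspableObject("souvenir (Section 2)", personal=True, flexible=True, iconic=True)
bottle = GraspableObject("musicBottles bottle", personal=False, flexible=False, iconic=False)
lens = GraspableObject("Illuminating Light lens", personal=False, flexible=False, iconic=True)

labels = [category(o) for o in (souvenir, bottle, lens)]
```

A system such as Senseboard, which mixes tools and tokens, would simply be described by several `GraspableObject` entries, one per kind of physical object it uses.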

One Graspable UI can contain physical objects from more than one category,
although the majority consists of only one category. Exceptions appear in the spatial and
relational categories by Ullmer and Ishii [41], e.g., SiteView [5], which uses generic
symbolic tools (e.g., the “rain interactor” is a generic object showing text) and
generic iconic tools (e.g., the “lights on interactor” is an object shaped as a light). The
Senseboard by Jacob et al. [25] contains both generic symbolic tools and generic
symbolic tokens. The tokens are fridge magnet-like blocks that can be linked to one
conference paper on the Senseboard; the tools can be used to execute commands on
the tokens, such as “copy” or “link”. A third example, BuildIt [12], contains both
generic iconic tools (a camera-shaped brick is used for determining a camera view
onscreen) and generic symbolic tokens (rectangular bricks represent handles to
pieces of furniture).

Table 1. An extension to the TUI-categorization by Ullmer and Ishii [41], with two
dimensions: the type of physical objects and the type of digital associations with the physical object.
The numbers between brackets are the number of associations possible with each physical
object at the same time and n = |n|. A dash marks a box without known examples.

                        Digital associations
                        Fixed (1)                                          Flexible (n)
Physical object type    Symbolic (tool)           Iconic (tool)            Symbolic (token)       Iconic (token)

Generic object          Bricks (1)                ActiveCube (1)           BuildIt (0/1)          -
(no existing mental     DataTiles (1)             BuildIt (1)              InfoStick (n)
model, mostly           FRIDGE (1)                Illuminating Light (1)   MediaBlocks (n)
multiple users)         Logjam (1)                Lego-like Blocks (1)     “memory objects” (n)
                        metaDESK (1)              metaDESK (1)             MusiCocktail (n)
                        musicBottles (1)          Robotic toys (1)         Rosebud (n)
                        MusiCocktail (1)          SenseTable (1)           Senseboard (0/1)
                        Navigational Blocks (1)   SiteView (1)             TellTale (0/1)
                        PingPongPlus (1)          Urp (1)                  Triangles (0/1)
                        Senseboard (1)                                     WebStickers (n)
                        SiteView (1)                                       WWICE (0/1)
                        Soundgarten (1)
                        Task Blocks (1)
                        Triangles (1)
                        Urp (1)

Personal object         -                         -                        Passage (0/1)          POEMs (n)
(with existing mental                                                                             Phenom (n)
model, mostly                                                                                     Living Memory Box (n)
single user)


4.1 Generic Graspable User Interfaces 

Generic Graspable User-Interface objects are mostly designed for office
environments. For example, blocks can be used as tools to control specific PC-functions, such
as Lego-like Blocks [4], Navigational Blocks [6], Bricks [11], BuildIt [12],
Senseboard [25], ActiveCube [26], SenseTable [31], DataTiles [34], Task Blocks [38],
metaDESK [40], Urp [46] and FRIDGE [47]. Besides tools, generic Graspable-UI
blocks can also be used as tokens that contain information or files, such as infoStick
[27], MediaBlocks [43], WWICE-tokens [35] and WebStickers [28]. Examples
outside offices include toys, such as TellTale [3], Triangles [15], PingPongPlus [24],
robotic toys [30], Rosebud [13] [14] and “memory objects” [16]. Besides toys, a number
of papers focused on audio or video applications in the home, such as Logjam [7],
musicBottles [22], MusiCocktail [29] and soundgarten [48].

4.2 Personal Graspable User Interfaces

Most personal Graspable UI objects, which are all tokens, can be found in home
environments, with the exception of the personal symbolic tokens of Passage [37]. Since
Graspable User Interfaces in the home can be used by any type of user, e.g., people 
who do not have any PC-experience, the system and in particular the graspable ob- 
jects should make clear, in one way or another, how they should be used. It is most 
clear to the user if she can come up with the digital associations and the graspable 
objects, such as in POEMs [39], the Living Memory Box [36] and the souvenirs of 
Section 2. 

Although Ullmer and Ishii [41] do not address “overloading” in their “emerging 
frameworks”-paper, this concept might explain partly the usefulness of their so-called 
“associative” category (besides the statement already made about the use in-home). 
Ullmer and Ishii state that they “are less confident of the utility of this category than 
those we have considered thus far. Nonetheless, the instances we have identified do 
seem to exhibit some consistency, suggesting the category may have merit”. (The 
categories considered thus far stand for the spatial, constructive and relational sys- 
tems.) 

To the authors’ knowledge all personal iconic tokens of the “associative” systems 
support “overloading”, including the Graspable UI described in Section 2 of this pa- 
per, which incorporates souvenirs that link to related media-items. This might explain 
why the iconic tokens of “associative systems” are not used for complex tasks in- 
volving other tokens: the digital interface is already complex enough with multiple 
pieces of information associated with each token. 



5 Discussion 

The strength of the extension proposed in this paper is that it includes Graspable UIs
which make use of existing everyday graspable objects, such as the personal souvenirs
people have in their homes. The need for this category of Graspable User Interfaces is
supported by recent views on the future of computing, such as Ambient Intelligence
[1] [2]. These visions state that in the future many networked devices will be
integrated in the environment, and the numerous examples of personal tools and tokens
that are featured in these future scenarios show that this can be done with personal
objects people already have.

Therefore, an interesting area of future research would be the personal-object
group. Currently many case studies start with personal objects and later “upgrade”
them with a link to digital information. Is it possible to do this the other way around,
or will this inhibit the personalization of the object? And why are the personal fixed
tool boxes empty? Is it because the field is not yet mature and has not existed long enough? One
can imagine a personal tool such as a bowl, which represents the user. Each object,
e.g. a souvenir like the ones described in Section 2, that is placed in this bowl links to
the digital associations created by the “bowl owner”, since one souvenir can have
different associations for different users. If the bowl represents its filtering function in
one way or another, e.g., by text, icons or perhaps shape, it would be a personal iconic
tool, and if it does not it would be a personal symbolic tool.

An interesting finding is that all the Graspable UIs mentioned in the personal 
iconic token box (Table 1) appear to make use of external memory (a subset of dis- 
tributed cognition, see e.g. [32]), although none of the papers mentions this explicitly. 
This is not possible with generic objects, since they are all alike, but it is convenient 
for personal objects, because the mental model is created by the user himself and not 
imposed by the system. Therefore this group of objects seems very suitable as re- 
minders. Another interesting finding is that currently no Graspable UIs are known in 
the personal fixed categories. 

The Generic Iconic Token box in Table 1 shows no examples, perhaps because an 
object that is flexible in its associations can contain several links but also several 
types of media and it is hard to represent an ever changing media type. 

Another remark concerns Dourish [9] who talked about dividing Tangible UIs ac- 
cording to the meaning they carried, on a scale from objects to actions. This scale 
might be most useful for the generic objects presented in this paper, since they have a 
unified meaning to users. The personal objects with predefined mental models might 
be difficult to fit in this subdivision. 



6 Conclusions 

This paper explains a possible extension to the TUI-categorization by Ullmer and
Ishii [41]. The extension is based on the idea that users of personal objects have an
existing mental model of the links between their personal physical object and the
accompanying digital information. In the authors’ opinion this extension is valuable,
since it accommodates the associative TUIs, which Ullmer and Ishii [41] found hard to categorize.
Furthermore, the benefit of using personal objects instead of generic objects is that in
the former case users already have mental models, and the Graspable or Tangible
User Interface can support existing object systems, instead of designing new ones that
have to be learned by users.



Abstract. We report research into concepts and technology for enabling end-
users to configure Ambient Intelligent environments. In this paper we focus on
the feasibility and acceptability of this endeavor from an end-user perspective. 
We describe a conceptual model and an experimental enabling technology that 
illustrates the viability of these concepts and a multi-faceted evaluation of these 
concepts from an end-user perspective. Our work suggests the need for a flexi- 
ble approach in letting users choose how much should be observable of system 
structure and function or of the processes of system learning and adaptation. 
Directions for future research in this field are described in the form of some 
provisional principles for shaping the interaction with end-user configurable 
Ambient Intelligence environments. 



1 Introduction 

The vision of Ambient Intelligence (AmI) promises that the environments where we
work, relax or commute will be furnished with an increasing number of
computationally augmented artifacts. AmI technology must fit seamlessly into the lifestyle and
life-patterns of very different individuals and adapt to situations and configurations
unforeseen by their designers and developers. One potential solution is to support
users in constructing and customizing their computational environments, as argued in [7].
This solution offers the benefits of an incremental and personalized construction of
AmI environments, which empowers end-users.

The concepts and the technology discussed below extend the notion of component-
based software architectures to the world of physical objects. Objects in people’s
everyday environment are augmented with autonomous computational artifacts, the
e-Gadgets, which can be used as building blocks of larger systems. The computational
environments formed by such artifacts are intended to be accessed directly and to be
manipulated by untrained end-users.

Apart from the serious technical challenges pertaining to this vision, which are
discussed in [4], an important research question that emerges is whether untrained
end-users will be capable and inclined not only to program individual interactive systems
but also to configure the environments they live or work in. The extrovert gadgets
project (e-Gadgets) has developed concepts, a model and a prototype implementation
to support this activity (see www.extrovert-gadgets.net). Below we describe this
technology briefly and we focus upon an evaluation of the e-Gadgets concepts from a
user perspective that tries to answer the research question raised.



2 Concepts and Terminology 

An e-Gadget is defined as an everyday physical object enhanced with sensing, actu- 
ating, processing and communication abilities. A GadgetWorld is a functional con- 
figuration of associated e-Gadgets that collaborate in order to realize a collective 
function. 

GAS-OS is the middleware that enables the composition of GadgetWorlds [4]. It is
a component framework that manages resources shared by e-Gadgets, determines
their software interfaces and provides the underlying mechanisms that enable
interaction among e-Gadgets. The current version of GAS-OS is written in Java. GAS-OS
supports IP-based communication using (without being bound to) IEEE 802.11g. For
the prototype implementation, we used iPAQ handheld computers to execute
GAS-OS. A special board has been designed that includes the hardware required to
interface GAS-OS with the sensors embedded in artifacts.

A software tool, the GadgetWorld editor, has been developed to facilitate the
composition of GadgetWorlds. The purpose of the editor is threefold: (1) to indicate/make
visible the available e-Gadgets and GadgetWorlds, (2) to form new GadgetWorlds and (3)
to assist with debugging, editing, servicing, etc. Two versions of the editor have been
created. One with richer functionality runs on a laptop personal computer and is
intended for the e-Gadget ‘professional’ designer. The second and simpler one runs
on an iPAQ handheld computer and is intended for the untrained end-user.

GAS-OS supports the composition of e-Gadgets, without having to access any 
code that implements their interfaces. This approach separates the computational and 
compositional aspects of an application, leaving only the second task to the end-user. 
In this way, domain and system concepts are specified in the generic architectural 
model and are offered ready to the application designer and the end-user- 
programmer. 

Plugs are software classes that make an e-Gadget’s capabilities and services visible
to people (through an editor) and to other e-Gadgets. For example, some of the
services that the alarm clock can provide are: time, hour, minute, day, alarm on/off,
sound; the lamp can provide such plugs as lamp on/off, light level. Composition is
effected through the definition of synapses (links) between pairs of (compatible)
plugs (see Fig. 1).

Fig. 1. e-Gadgets represented as circles, making synapses through their plugs to form
GadgetWorlds

For example, consider a person who wants to achieve the following collective
behavior from the objects in his/her environment: when the alarm clock sounds, the
lamp on the ceiling of his/her room should be automatically switched on and the
heater should start preparing the water for a shower. A novice e-Gadgets user could
associate the alarm on/off plug of the alarm clock to the on/off plug of the lamp and
to the on/off plug of the heater; a more experienced user could also use the light-level
plug of the lamp or the temperature plug of the heater, thus programming the
behaviour associated with a synapse.
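
The alarm clock example can be expressed as a minimal Python sketch of plugs and synapses. This is an illustration under simplifying assumptions, not the GAS-OS implementation (which is written in Java): the class `Plug` and its methods `connect` and `set` are hypothetical names, and a synapse is reduced to a one-way link that forwards a value.

```python
class Plug:
    """A named capability of an e-Gadget (hypothetical model, not the GAS-OS API)."""

    def __init__(self, gadget, name):
        self.gadget = gadget
        self.name = name
        self.value = None
        self.targets = []  # plugs reached through synapses starting at this plug

    def connect(self, other):
        """Create a synapse from this plug to another (assumed compatible) plug."""
        self.targets.append(other)

    def set(self, value):
        """Update the plug and forward the new value along its synapses."""
        if value == self.value:
            return  # forwarding only on change also stops loops between plugs
        self.value = value
        for target in self.targets:
            target.set(value)


alarm = Plug("alarm clock", "alarm on/off")
lamp = Plug("ceiling lamp", "on/off")
heater = Plug("heater", "on/off")

# The novice configuration: one plug of the alarm clock drives two other plugs.
alarm.connect(lamp)
alarm.connect(heater)

alarm.set(True)  # the alarm sounds
```

After the last call both `lamp.value` and `heater.value` are `True`. The more experienced user's configuration would attach a transformation to the synapse, for instance mapping the alarm state to a light level instead of forwarding it unchanged.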

Physically, the Plug/Synapse model is implemented with a peer-to-peer architec- 
ture. Each end of a synapse is managed by the GAS-OS running on each e-Gadget. A 
synapse serves as the abstraction of a communication channel between peers. The 
GAS-OS of the two e-Gadgets participating in a synapse will exchange events and 
data in a way specified by the adopted communication protocol; however, this occurs 
only when they have ‘discovered’ each other. Discovery is twofold: (1) on demand by 
an editor and (2) proactively carried out by an e-Gadget, after a synapse request. For 
example, if the light bulb of the ceiling-lamp is burnt-out, then the alarm clock e- 
Gadget may look for another e-Gadget that offers the lighting service; it can then 
switch this light on when the alarm goes off. 
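
The proactive case can be sketched as a lookup over the services that nearby e-Gadgets advertise. The Python fragment below is a simplified illustration only: the dictionary layout, the `working` flag and the function `find_provider` are assumptions made for this example and say nothing about how GAS-OS actually performs discovery.

```python
def find_provider(gadgets, service):
    """Return the first working gadget that offers `service`, or None if there is none."""
    for gadget in gadgets:
        if gadget["working"] and service in gadget["services"]:
            return gadget
    return None


# Hypothetical view of the e-Gadgets in the vicinity of the alarm clock.
gadgets = [
    {"name": "ceiling lamp", "services": {"light"}, "working": False},  # bulb burnt out
    {"name": "desk lamp", "services": {"light"}, "working": True},
    {"name": "heater", "services": {"heat"}, "working": True},
]

# The alarm clock looks for another gadget that offers the lighting service.
fallback = find_provider(gadgets, "light")
```

Here `fallback` is the desk lamp, so the synapse that pointed at the ceiling lamp could be re-established with it.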

One important function of the editor is to identify and present to the user the e- 
Gadgets in its vicinity. It also helps inspect the capabilities offered by each device. 
Such capabilities have a direct relationship with the actuating / sensing capabilities of 
the objects and the functions that are intended by the appliance manufacturers. Some 
of these capabilities, especially the more complex ones, may not be obvious to peo- 
ple, apart from via the editor. In addition, the editor identifies the current configura- 
tions of e-Gadgets in its vicinity and displays them for supervision. The links between 
the compatible capabilities of appliances are visualized and can be manipulated 
through the editor. Associations between certain capabilities of appliances/objects can 
be formed thus creating configuration sets for a certain purpose. Once such a set of 
Synaptic associations is established, the part of the operating system that runs on each 
participating device will ensure proper operation. The set that is created remains op- 
erational as long as is required unless there’s technical inability to maintain its func- 
tionality. Eventually a user may deactivate a GadgetWorld by destroying related as- 
sociations. 

Parallels can be drawn between the e-Gadgets editor and the Interstacks interface
discussed in [6] for enabling end-users to integrate specialized hardware devices and
to control the flow of information amongst them. Both projects aim to support
end-user configuration of a set of hardware components, and both are inspired by
component languages. Putting aside some significant conceptual differences from an AmI perspective
that pertain to our ambition to convert every-day objects to computationally enabled
artifacts, we note that from a user perspective there will be very similar concerns in
designing appropriate editor interfaces. As no evaluation from a user perspective of
the Interstacks project has been published to date, we hope that the evaluation
discussed in section 3 will provide useful lessons for the design of editor interfaces like
that of Interstacks or other comparable platforms.

In the course of the project 12 sample e-Gadgets have been created. Further, at the 
University of Essex, within the iDorm space (a laboratory resembling a student dor- 
mitory) several furnishings and devices run the e-Gadgets architecture. These e- 
Gadgets have been used as a test implementation of embedding the proposed platform 
into everyday objects. 

Apart from users composing GadgetWorlds manually through an editor, the project 
has created prototypes where intelligent agents learn from people’s use of a prede- 
fined GadgetWorld and proactively modify synapses between gadgets. 



2.1 Example GadgetWorld Scenario 

Let's assume a story involving the use of everyday artifacts. John, 21, a student in
economics, has recently created his Study Application (a simple Ami environment)
with a new GadgetWorld he purchased. He wants his light to turn on automatically
when he studies at his desk. This functionality can be supported by a GadgetWorld
consisting of four e-Gadgets: a desk, a desk-lamp, a book and a chair. The book is
equipped with a light sensor whose reading is made available through the plug
‘luminosity’. When the lighting is lower than a certain threshold, someone sits on
the chair, and the book is open and on the desk, the synapse with the on/off plug of
the lamp will cause the lamp to switch on.
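
As an illustration only, the scenario can be sketched in a few lines of Python. The
class names Gadget and Synapse, the plug names other than ‘luminosity’, and the
threshold value are assumptions introduced here; they are not the e-Gadgets or
GAS-OS programming interface.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class Gadget:
    """Hypothetical in-memory stand-in for an e-Gadget and its plugs."""
    name: str
    plugs: Dict[str, object] = field(default_factory=dict)


@dataclass
class Synapse:
    """Drives one target plug from a condition over other gadgets' plugs."""
    condition: Callable[[], bool]
    target: Gadget
    target_plug: str

    def evaluate(self) -> None:
        # Re-evaluate the condition and write the result to the target plug.
        self.target.plugs[self.target_plug] = bool(self.condition())


LUMINOSITY_THRESHOLD = 200  # assumed value and units, not from the paper

desk = Gadget("desk", {"book_on_top": True})
chair = Gadget("chair", {"occupied": True})
book = Gadget("book", {"open": True, "luminosity": 120})
lamp = Gadget("desk-lamp", {"on_off": False})


def john_is_studying_in_the_dark() -> bool:
    # All four conditions of the scenario must hold at the same time.
    return (
        book.plugs["luminosity"] < LUMINOSITY_THRESHOLD
        and chair.plugs["occupied"]
        and book.plugs["open"]
        and desk.plugs["book_on_top"]
    )


study_synapse = Synapse(john_is_studying_in_the_dark, lamp, "on_off")
study_synapse.evaluate()
print(lamp.plugs["on_off"])  # True: all four conditions hold
```

If John stands up (chair.plugs["occupied"] = False) and the synapse is evaluated
again, the lamp plug is set back to False.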



3 Evaluation of e-Gadgets

An expert review workshop and an analysis based on the Cognitive Dimensions
framework [2] were carried out to assess the concepts prior to and during the prototype
implementation. When a working system became available, we sought expert feedback
during demonstrations and organized a formal evaluation in which test users
created and modified their own GadgetWorlds. This work is summarized below.



3.1 Expert Review 

Three invited experts in human-computer interaction (academics) participated in an
early evaluation workshop, which had the general format of a focus group discussion.
First, experts discussed the general concepts of the e-Gadgets project and a collection
of 4 scenarios. Then they were given a problem-solving exercise for designing their
own GadgetWorld on paper. This exercise assumed 4 e-Gadgets: a desk, a lamp, a
chair, a mat and an MP3 player. The experts sketched solutions for controlling the
lamp and the MP3 player through the other e-Gadgets. After designing such
configurations, they were asked to comment on the anticipated usage and acceptance
problems of e-Gadgets technology. We then demonstrated two ‘horizontal’ prototypes
of a GadgetWorld editor, i.e., prototypes providing an overview of the system rather
than a fully functional segment of it. One was an HTML prototype running on a laptop
and the other was a video prototype demonstrating possibilities for a tangible
interface to the e-Gadgets editor.

A wealth of formative feedback was generated in the expert workshop. In broad 
terms, the experts were concerned about how users would observe the invisible 
boundaries of GadgetWorlds and the logical connections between physical objects. 
Skepticism was expressed regarding the technical complexity given to users and the
acceptability of agent technology in modifying the environment in which we live and
work.

An interesting observation regarding the abstractions adopted by the project was
that human activity is modeled through the information and behavior of e-Gadgets
equipped with sensing capabilities. However, humans themselves are central
to any description of human activity and to the context of operation of the
e-Gadgets, so they should appear as “first class” abstractions in a vocabulary for
describing and configuring Ami environments.

3.2 Analytical Evaluation Using Cognitive Dimensions 

The Cognitive Dimensions framework [2] is a “broad-brush” technique for evaluating 
information artifacts, e.g., notations and interactive systems. It helps expose trade-offs 
made in the design of information artifacts with respect to the ability of humans using 
them to capture their concepts and intentions and to manage and comprehend the 
artifacts they create. Some of the most important conclusions from this analysis are 
summarized below, noting the relevant cognitive dimension where appropriate. 

• The GadgetWorld editor should aim to bridge the gap between architectural
descriptions of a GadgetWorld and the user’s own conceptualizations, which might be
rule-based, task-oriented, etc. (improving the Closeness of Mapping dimension).

• E-Gadgets require few conventions to be learnt (low terseness) and have an ab- 
straction gradient favoring the non-trained programmer. 

• E-Gadgets introduce hidden dependencies between the behaviors of apparently 
unrelated objects that should be made observable through the editor. 

• An object may belong to several GadgetWorlds and its function is difficult to 
understand from its physical appearance (low Role Expressiveness). 

With respect to the last two points, this analysis corroborated the opinion expressed
by the experts that untrained users of e-Gadgets are handed programmers’ tasks, so it
would be inappropriate not to provide them with the corresponding tools that help
programmers carry out those tasks, e.g., libraries, debuggers, object inspectors, etc.

3.3 Demonstration Feedback at Conferences



The e-Gadgets technology was demonstrated at two international events attracting 
experts in human-computer interaction. Around 30 delegates experienced the demon- 
stration at the “TALES of the Disappearing Computer” (Santorini, Greece, May 
2003) and 10 completed the feedback forms. This event attracted delegates studying 
aspects of Ambient Intelligence (e.g., computer scientists, industrial designers, hu- 
man-computer interaction experts) representing both industry and universities. Ap- 
proximately 70 people visited the demonstration at the British HCI conference (Bath, 
September 2003) and we received 29 completed feedback questionnaires. The latter is 
a very specialized venue for Human Computer Interaction researchers, primarily 
representing the academic world. 




Fig. 2. The GadgetWorld Editor running on an iPaQ 



The demonstration featured the handheld editor running on the iPaQ that supported 
the discovery of 3 e-Gadgets: a Mathmos “Tumbler” light (that resembles a luminous 
brick), an MP3 player and a pressure sensitive floor-mat. After a short explanation of
2-3 minutes, delegates were able to create a GadgetWorld, e.g., for controlling the 
volume and genre of music played from the position of a person on the mat or by 
flipping the Mathmos Tumbler on its different sides. 

A wealth of comments was collected through written questionnaires; here we present
only the general reaction to the e-Gadgets concepts, rather than more detailed
comments about the demonstration set-up, the specific implementation or minor
‘usability bugs’ in the editor’s graphical user interface.

Some comments by respondents were conflicting: 13 delegates found that this
technology will not be used because it is too complex, while 16 noted as a positive
impression that it is very easy to create and modify GadgetWorlds or that it is very
easy to learn. Clearly, both are reasonable expectations that depend on the context of
deployment and the targeted users. A lot of controversy was caused by the use of
intelligent agents as an aid to configure environments. Some Human Computer
Interaction experts rejected agent technology outright, others were enthusiastic,
and others pointed at some well-known caveats for adaptive systems from an
end-user perspective (pertaining to the loss of control caused by agents as a result of
reduced predictability and observability of system behaviour). Five respondents
suggested that a PDA is not a good platform to run the editor on, as it offers very
limited display size. A couple of respondents pointed at two issues that run deeper
into the concepts of the e-Gadgets project:

• The e-Gadgets abstractions refer to the system structure rather than the tasks of
the user, so users bear the onus of translating their intentions into architectural
elements.

• In actual life, an editor like the one shown during the demonstrations would
serve not only to inspect and to form one’s own GadgetWorlds, but also for people to
inspect and understand GadgetWorlds purchased ready-made or made by another
member of the household. This means that a GadgetWorld architecture should
be understandable not only to its creator but to other individuals as well.

A general point that relates to the latter observation is that the tasks of comprehending 
and modifying pre-defined configurations of components should be included in user 
testing of configurable Ami environments. 



3.4 Evaluation at the iDorm 

The iDorm facility is a specially constructed student dormitory that has been set up 
within a computer laboratory at the University of Essex, for experimenting with 
sensing technologies and intelligent agents. It is equipped with several sensing and 
actuating components, which for this study were controlled through GAS-OS. 

The user study was a combination of short tests and a single trial that took place 
overnight. The short tests aimed to gauge how potential users grasp the fundamental 
concepts of e-Gadgets described above and whether they can create or modify their 
own GadgetWorlds. The overnight trial aimed to get a more realistic test of e- 
Gadgets. The user had to experience, albeit for a short time, the effects of ‘program- 
ming’ the environment. Because of practical constraints, we did not attempt repeated 
overnight tests, which however would have been preferable from a research perspec- 
tive. 





P. Markopoulos, I. Mavrommati, and A. Kameas 



Participants 

Three pairs of paid participants were recruited locally for the short tests. We looked
for ‘technophile’ users familiar with computers. The pre-test questionnaire showed
that participants were university students (four of them computer science students)
who were not familiar with the project. Only one participant was over 35 years of age.
Only one participant was not a mobile-phone user. The rest were frequent users of
personal computers, e-mail, SMS and mobile phones. In summary, all participants
had a higher level of education and familiarity with computing than one would
currently expect from the general public, but were fairly representative of the
capabilities of potential early adopters of e-Gadgets technology.

Materials 

The following e-Gadgets were made available for the user test: 

1. Occupancy: Senses if the room is occupied or not.

2. LightLevel: Measures the ambient light in the room. 

3. Chair: Senses if someone is sitting on it or not. 

4. Bed: Senses if someone is on the bed or not. 

5. Temperature: Senses the room temperature. 

6. RoomLights: Switches room-lights on and off. 

7. DeskLight: Switches desk light on or off. 

8. BedLight: Switches bed-light on or off. 

9. Blinds: Opens or shuts (completely) the blinds and lets you set the angle of the 
blades. 

10. MP3 player: Starts or stops playing music, sets the volume and lets the user
choose a genre of music. 

11. Clock: Tells the time or raises an alarm. 

One of the authors acted as a facilitator and the other as an observer/note-keeper. 
The experimenters introduced the subjects to the experiment, explained the set-up and 
the nature of their involvement and obtained informed consent for videotaping. Par- 
ticipants filled in a pre-session questionnaire, describing their familiarity with com- 
puter technology in general and more specifically with handheld devices. A brief oral 
explanation plus a minimal demonstration of the system was then provided. 

Participants were given the editor running on an iPaQ handheld computer (see Fig.
2). This editor supports discovery of devices, which appear as a list. Through a set of
pop-up menus operated with the stylus, the user can connect the plugs of two
e-Gadgets to create a synapse. After creating a few synapses, the user can run
the resulting configuration. Note that while some of the tasks involved control of
light, sensing of movement, etc., which are typical examples of home automation, our
emphasis was different. We did not wish to test the acceptance of the home-automation
functionalities used or to compare against purpose-specific software like X10. Rather,
we wanted to assess the way in which e-Gadgets are put together to form functional
configurations.
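
The editor workflow described above (discover devices, connect two plugs into a
synapse, run the configuration) can be mimicked by a toy model such as the one
below. It is a sketch under stated assumptions: the Editor and Plug classes, the
plug names and the "in"/"out" direction labels are hypothetical and do not
reproduce the actual editor or GAS-OS interfaces.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

PlugId = Tuple[str, str]  # (gadget name, plug name)


@dataclass(frozen=True)
class Plug:
    gadget: str
    name: str
    direction: str  # "out" = sensed value, "in" = actuator setting (assumed labels)


class Editor:
    """Toy model of the editor workflow: discover, connect, run."""

    def __init__(self, plugs: List[Plug]) -> None:
        self._plugs: Dict[PlugId, Plug] = {(p.gadget, p.name): p for p in plugs}
        self._synapses: List[Tuple[PlugId, PlugId, Callable[[object], object]]] = []

    def discover(self) -> List[str]:
        # Discovered devices are presented to the user as a list.
        return sorted({gadget for gadget, _ in self._plugs})

    def connect(self, source: PlugId, target: PlugId,
                transform: Callable[[object], object] = lambda value: value) -> None:
        # A synapse links an output plug to an input plug.
        if (self._plugs[source].direction != "out"
                or self._plugs[target].direction != "in"):
            raise ValueError("a synapse must go from an 'out' plug to an 'in' plug")
        self._synapses.append((source, target, transform))

    def run(self, readings: Dict[PlugId, object]) -> Dict[PlugId, object]:
        # Propagate the current sensor readings across all synapses.
        return {target: transform(readings[source])
                for source, target, transform in self._synapses
                if source in readings}


editor = Editor([
    Plug("Chair", "occupied", "out"),
    Plug("Occupancy", "room_occupied", "out"),
    Plug("MP3 player", "genre", "in"),
    Plug("RoomLights", "on_off", "in"),
])
# Change the genre depending on whether someone sits on the chair.
editor.connect(("Chair", "occupied"), ("MP3 player", "genre"),
               lambda seated: "classical" if seated else "pop")
# Switch the room lights according to room occupancy.
editor.connect(("Occupancy", "room_occupied"), ("RoomLights", "on_off"))

commands = editor.run({("Chair", "occupied"): True,
                       ("Occupancy", "room_occupied"): False})
print(editor.discover())
print(commands)
```

The direction check is one simple way of rejecting meaningless connections at
editing time; whether the real editor performs such a check is not stated in the
text.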

One of the two participants would take control of the editor and was given advice
on how to operate it. Tasks were given one by one on cards. Due to technical
malfunctions we adapted some tasks on the fly (during the tests the Blinds and the
BedLight e-Gadgets ceased to operate). After the first task had been completed in this
way, this ‘trained’ participant explained the operation to their peer, who performed
tasks 2-5 of the experiment (see Fig. 3). This procedure is an adaptation of the
Co-Discovery [5] and Peer Tutoring [3] methods for obtaining verbalisation data from
usability testing. Getting this data was crucial in our case, as it was difficult for
us to observe what participants were doing on the handheld device (because of the
small size of the screen). Some of the functionalities they managed to create were,
for example, changing the genre of music depending on whether someone was sitting on
the chair, and switching the light off when there was nobody in the room.

Fig. 3. The experimenter instructs a participant how to execute the first (training) scenario.
Then she takes over to explain the editor to the second participant. Subsequently, the second
participant will take over the editor.

A mini-structured interview was conducted at the end of the session, after which
participants filled in a written questionnaire with similar questions. This format
(an interview based on questions, followed by completing the questionnaire) was
adopted to make sure questions were understood and to encourage participants to bring
their opinions into the open, but also to give them the opportunity to better
formulate their thoughts in writing after the discussion.

Results of the iDorm Test 

The short evaluation sessions went very smoothly, despite occasional minor technical
failures. The overnight test was less successful due to a network failure, so few
conclusions can be drawn from it, other than the importance of robustness and graceful
degradation of Ami systems.

Test users, including the non-computer science students, were surprisingly capable
of completing their tasks and more enthusiastic than we had expected. Participants
reported that they enjoyed this type of programming activity; one mentioned that she
liked “messing around with her furniture”. We could also note the enthusiasm in their
non-verbal behavior. All participants, despite some shortcomings of the editor’s
graphical user interface, found configuring a GadgetWorld straightforward.

This apparent contradiction to the opinion of several experts can be explained in two
ways:

1. Our test participants were unusually well educated and familiar with technology
(university students, some trained in computer science).

2. HCI experts overestimated the difficulty of the concepts and of interaction through
a handheld editor. A possible explanation could be that younger adults (like our test
participants) are very adept with handheld technology and are as familiar with the
relevant interface conventions as most people are with graphical desktop interfaces.

Because of technical difficulties only one pair of subjects experienced the adaptation
effected by the agent. An interesting observation we made at that point was that
as soon as an agent was present, the test participants were no longer interested in
the structure of the system itself: whether a synapse was there or not was no longer
an issue. Rather, understanding how the agent learns and what model it assumes
about the user became more important. This feature of learning software agents seems
to address the requirement for a task-oriented language to communicate with the system
and to help simplify the whole ontology that needs to be communicated to users.
Subjects did not worry about the existence of the agent and the fact that their actions
were being ‘watched’ by the device, but were disconcerted that they could not be
aware of how much the agent had learned at any moment. Also, as one subject stated,
“...I don’t really know how much control over it and if I cannot control it I would 
be afraid to use it. If I don’t understand it I cannot control nor understand what it is 
doing...” 

A range of comments was given by the participants, particularly suggesting
improvements to the interface, e.g., to the terminology used in the interface,
to the observability and predictability of the workings of the system, and to the
support for task-oriented rather than structure-oriented descriptions of system
behaviors.

All but one participant said they enjoyed using the system, and participants were
very effective in achieving their tasks. One person felt that she needed more practice
to grasp the concepts. We note that she was one of the least positive users about the
whole experience. Invariably, users complained that not enough aspects of the physical
gadgets were controllable through their digital manifestation: once they had control
over some properties they expected this to be extended to the rest. Conveying the
scope of the GadgetWorld, both in terms of the editor interface and in terms of
product design features of the e-Gadgets, appears to be an important and hard design
challenge.



4 Discussion 

The evaluation reported in the previous section has led to some provisional
design principles, setting a direction for future work in the domain of end-user
configurable Ami environments:

• There should be several alternative ways for the users to articulate their inten- 
tions, e.g., both task based and structure oriented descriptions. 

• End-user programming environments should offer representations of humans as 
first-class abstractions in the editing environment. 

• An Ami environment should not surprise the user: automation or adaptation
actions should be visible and predictable, or at least justifiable.

• Intelligence should be applied only to simplify complex tasks. 

• End-user programming of the user environment should be supported with tools similar
to those offered to programmers, e.g., debuggers, object browsers, help, etc.

This list of principles focuses on the way in which end-user configuration of Ami 
environments should be supported. In this way it is complementary to works that 
attempt to characterize interaction with perceptive environments as, for example, the 
discussion presented by Bellotti and her colleagues in [1]. 

We noted the skepticism among HCI experts regarding the ability of end-users to
grasp the concepts we proposed. The user tests performed in the iDorm seem to
allay the fears of impossible complexity, especially for new generations of users
growing up surrounded by technology. However, user tests scaled up in both scope
and duration are required to gain more confidence in such a conclusion.

This research has made several inroads in the effort to empower people to actively 
shape Ami environments. It has demonstrated the feasibility of letting end-users ar- 
chitect Ami environments, though significant advances are still needed in engineering 
enabling technology. 

From a researcher’s perspective, we have demonstrated the value of the Cognitive
Dimensions framework as a tool for understanding interaction with Ami environments,
and we recommend its uptake in this field. Finally, the experiences reported
suggest that letting users act as composers of predefined components and letting them
interact with intelligent agents are two worthy and complementary approaches.
Future work should explore their combination in a scheme that lets
users choose and develop their strategy for composing a personalized Ami environment.


Abstract. We discuss how an augmented-reality platform can be used
as a transparent interface to a windows environment. The resulting Vi- 
sual Interaction Enriched Windows (VIEWs) intend to realize an evolu- 
tionary rather than a revolutionary transition from the classical desktop 
environment to an augmented-reality (AR) environment. Therefore, in 
VIEWs, windows applications can still be controlled by standard means, 
i.e., by using mouse and keyboard. In this way, acquired user skills with
existing windows applications can still be exploited. The additional inter- 
action styles that are offered by the AR platform, including two-handed 
interaction, pen input with in-place visual feedback, and transparency, 
may however be used to improve specific interactions, such as sketching 
and handwriting, that are more difficult to perform on a classical desktop. 
The user is free at all times to choose the interaction style that best suits 
his/her needs when performing specific operations. The VIEWs concept 
has been implemented as part of an existing AR tool for designers. 



1 Introduction 

New developments in virtual reality (VR) and augmented reality (AR) are often
motivated by pointing at shortcomings of the current desktop with its WIMP
(Windows, Icons, Menus, Pointer) interface. More specifically, it has often been
argued that some of the best developed and most natural ways of interacting
and communicating, such as by means of handwriting and sketching [1], two-handed
"tangible" interaction [2,3], and spoken language [4], are not (or hardly)
supported in current interfaces. Therefore, it is worthwhile to pursue new
mixed-reality (MR) [5,6,7] interaction styles that aim at making better use of such
well-developed human skills. The last decade has witnessed the development of
a number of such systems (ClearBoard [8], DigitalDesk [9], BrightBoard [10],
Build-It [11], LivePaper [12]), including our own Visual Interaction Platform
(VIP) [13]. The pace at which new mixed-reality systems are being developed
worldwide is by now such that it is no longer possible to keep track of all of
them. The mentioned systems are, however, representative of the technologies
and applications that have been developed up to now.

There is much potential in the use of MR systems, especially for collaborative 
work [14] and visually demanding tasks such as creating and organizing objects 



P. Markopoulos et al. (Eds.): EUSAI 2004, LNCS 3295, pp. 255-266, 2004.
(c) Springer-Verlag Berlin Heidelberg 2004





J.-B. Martens, D. Aliakseyeu, and J.-R. de Pijper 



in two (2D) and three (3D) dimensions. It is however not obvious that the 
benefits, in terms of efficiency and enjoyment, offset the increased complexity in 
the systems themselves and in the imagined usage scenarios. There are indeed 
several aspects to the existing WIMP interface that cannot be dismissed very 
easily, amongst others: 

1. The current WIMP interface to desk-top and portable computers, mostly 
with a mouse or pen and keyboard as input devices, is by now a de facto 
standard that is widely accepted and supported by many applications. Most 
people are accustomed to the WIMP interaction style and can handle it 
without having to pay much explicit attention to it. 

2. Although objections can be raised against the user-friendliness and complexity
of the WIMP interface to some widely-used application programs, such
as Photoshop, Word, Excel, etc., people have learned to use this interface
and will (rightfully) resist any initiative to drastically change their current
practice.

3. The mouse-and-keyboard interaction style may well be close to optimal for 
some user-computer interactions such as (error-free) text entry, point-wise 
interactions in a plane, drawing primitives such as circles and rectangles, etc. 

4. Although the available (mouse and keyboard) devices are not very well suited
for performing some operations, replacing them by alternative input devices
can introduce new and unexpected problems (for example, sketching is easier
and more controlled with a pen on a graphical tablet than with a mouse,
while the situation is reversed for other actions such as double-clicking). The
aspects of generality and extensibility [15], i.e., of finding the right compromise
between the specific demands of individual tasks and the diversity of
input devices and interaction styles offered, definitely need to be taken into
account here. New interaction devices should probably only be introduced if
they can provide an advantage for a range of relevant tasks.

5. Many man-years have been invested in creating windows programs for diverse
applications such as office work, communication, scientific research,
art, leisure, etc.; the manpower, time, knowledge and motivation required
for adjusting these applications to alternative interaction styles are simply not
available. We hence need to create access to or easy communication with such
applications from within newly developed platforms.

6. With the advent of laptops, pocket and notebook PCs, mobile phones and
wireless networks, WIMP-based systems have become very portable (to be
used anytime, anywhere). Although visions of ubiquitous computing [16,17]
present a similar future for augmented-reality systems, current systems have
very visible technology (such as helmet-mounted displays, see-through glasses,
large projectors), so that we are still far removed from this vision. Some
of this technology constitutes an obvious obstacle to natural human-human
communication, which makes it unacceptable in most social contexts.

In summary, we can safely state that it is unlikely that people will simply abandon
the advantages of familiar tools for the more speculative opportunities offered
by MR systems, especially if such systems are more expensive and less
portable.

Based on the above arguments, it seems that we should strive to combine the
strengths of the current WIMP interface with newly proposed MR interaction
styles [18], rather than trying to completely replace this desktop (metaphor).
In this way, we can avoid forcing people to make drastic changes in
their current work practice, including the use of window applications, while at
the same time creating access to new MR interaction styles. This approach is
not only expected to ease the acceptance of new techniques, but also allows us to
study in a more structured way the added value and preferred use of these new
interaction styles. We expect that different users will mix existing practices with
the newly offered MR interaction styles in different ways, and that new combined
interaction styles will evolve over time. In order to study this, we obviously
require prototype systems that incorporate this idea. The development of such
a prototype system, which is intended to be used for future experimental studies,
is described in this paper.

The remainder of this paper is organized as follows. In section 2, we describe
the Visual Interaction Platform (VIP) that we use as our platform for developing
new AR applications. This section provides insight into the AR interaction
techniques that can be offered in addition to the mouse-and-keyboard inputs. In
section 3, we discuss how Visual Interaction Enriched Window (VIEW) applications
can be implemented on this VIP using a transparent interface that acts as a filter
on the input to and output from windows applications. More specifically, user
actions (amongst others with a pen on a graphical tablet) are mapped to mouse
and/or keyboard actions for a windows application, while the visual output from
this application is mixed with visual output from the transparent interface itself.
We also describe how we have integrated VIEWs in the Electronic Paper (EP)
prototype, an existing computer-support tool for early graphical design [19].
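
To make the idea of a transparent interface acting as a filter more concrete, the
following Python sketch shows one possible reading of it: a pen event in workspace
coordinates is translated into a mouse event in the coordinates of the hosted
window, and an overlay pixel is alpha-blended with the application's pixel. All
names (PenEvent, MouseEvent, WindowPlacement, pen_to_mouse, blend) and the simple
linear mapping are assumptions made for illustration, not the VIEWs implementation.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Rgb = Tuple[int, int, int]


@dataclass(frozen=True)
class PenEvent:
    x: float         # pen position in workspace coordinates
    y: float
    pressure: float  # 0.0 = hovering, > 0.0 = touching the tablet


@dataclass(frozen=True)
class MouseEvent:
    x: int           # position in the application's own pixel grid
    y: int
    left_down: bool


@dataclass(frozen=True)
class WindowPlacement:
    left: float      # where the window is shown in the workspace
    top: float
    width: float
    height: float
    pixel_width: int   # native size of the application window
    pixel_height: int


def pen_to_mouse(pen: PenEvent, win: WindowPlacement) -> Optional[MouseEvent]:
    """Input filter: map a pen event to a mouse event, or None if outside."""
    u = (pen.x - win.left) / win.width
    v = (pen.y - win.top) / win.height
    if not (0.0 <= u <= 1.0 and 0.0 <= v <= 1.0):
        return None  # the event is not meant for this window
    x = min(int(u * win.pixel_width), win.pixel_width - 1)
    y = min(int(v * win.pixel_height), win.pixel_height - 1)
    return MouseEvent(x, y, left_down=pen.pressure > 0.0)


def blend(app_pixel: Rgb, overlay_pixel: Rgb, alpha: float) -> Rgb:
    """Output filter: mix the overlay (weight alpha) with the application pixel."""
    return tuple(round(alpha * o + (1.0 - alpha) * a)
                 for a, o in zip(app_pixel, overlay_pixel))


window = WindowPlacement(left=100, top=100, width=400, height=200,
                         pixel_width=800, pixel_height=400)
print(pen_to_mouse(PenEvent(300, 200, 0.7), window))
print(pen_to_mouse(PenEvent(50, 50, 0.7), window))
print(blend((100, 100, 100), (200, 200, 200), 1.0))
```

A pen event outside the window yields None, so that the transparent interface can
handle it itself (for example as a sketching stroke) instead of forwarding it.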

2 Visual Interaction Platform 

2.1 Hardware 

Three Visual Interaction Platforms (VIPs) [13] that differ in implementation 
details are currently available at our laboratory, two of which are shown in 
Figure 1. The VIP design was inspired by earlier designs such as the DigitalDesk 
[9] and the Build-It system [11]. A video projector is used to create a large 
workspace on the horizontal surface of a graphical tablet, more specifically, an 
UltraPad A2 tablet from Wacom. In one VIP system, shown in the left part of 
Figure 1, the image is projected directly onto the tablet, while in an alternative 
VIP system, shown in the right part of Figure 1, the image is projected via a 
mirror, which makes the system more compact.

Fig. 1. Two different prototypes of the Visual Interaction Platform (VIP).

The user can not only use the traditional keyboard and mouse for interaction
in the horizontal workspace, but may also perform handwriting, drawing and
sketching actions by means of a pen on the graphical tablet and/or move
tangible objects across the tablet. These graspable objects [2], from now on
mostly referred to as brick elements (BELs), can control virtual objects and can
be operated across the entire workspace. They are coated with retro-reflecting
material and are illuminated by an infrared light source located above the table.
A camera located next to the infrared light source tracks the movements
of these interaction elements. The user can interact within specially designed
applications, such as the Electronic Paper (EP) tool discussed in section 3, by
modifying the locations and orientations of these BELs. The infrared lighting
and the coating of the BELs are essential for real-time and robust
tracking.
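
The reason why infrared lighting and a retro-reflecting coating simplify tracking
can be illustrated with a minimal sketch: the BELs show up as very bright regions
in the camera image, so a fixed threshold followed by a connected-component search
is enough to locate them. The function below is an assumed, simplified stand-in
for the actual tracking software; the threshold, the minimum blob size and the
plain-list image format are illustrative choices.

```python
from collections import deque
from typing import List, Tuple


def find_bright_blobs(image: List[List[int]], threshold: int = 200,
                      min_pixels: int = 2) -> List[Tuple[float, float]]:
    """Return (x, y) centroids of 4-connected regions at or above threshold."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    centroids: List[Tuple[float, float]] = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or image[r][c] < threshold:
                continue
            # Breadth-first flood fill collects one connected bright region.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and image[ny][nx] >= threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(pixels) >= min_pixels:  # ignore isolated noisy pixels
                cy = sum(p[0] for p in pixels) / len(pixels)
                cx = sum(p[1] for p in pixels) / len(pixels)
                centroids.append((cx, cy))
    return centroids


frame = [
    [0,   0,   0, 0, 0,   0],
    [0, 250, 240, 0, 0,   0],
    [0, 255, 230, 0, 0,   0],
    [0,   0,   0, 0, 0,   0],
    [0,   0,   0, 0, 0, 210],  # single bright pixel, treated as noise
]
print(find_bright_blobs(frame))  # [(1.5, 1.5)]
```

Estimating the orientation of a BEL would need an additional step (for instance
fitting the main axis of each region), which is omitted here.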

Unlike in the current desktop environment, where the mouse actions and the 
cursor movements occur at separate positions, visual feedback in the horizontal 
workspace occurs at the positions indicated by the BELs and/or the pen. Because 
of this close coupling between user actions and visual feedback by the system, 
the resulting horizontal workspace with its AR capabilities is referred to as the 
action-perception space. 

Apart from this action-perception workspace, the VIP has a second, vertically
oriented workspace that is referred to as the communication space. This
second workspace is only accessible with traditional keyboard-and-mouse input,
and can be used to display a standard windows environment (for office work),
to communicate with remote participants (in a video conference application),
to supply extensive visual feedback (in case of visualization), etc. [13]. The images
created in the action-perception space and in the communication space are
provided by the primary and secondary outputs of a dual-screen video board.

2.2 Software

The output image (display) rendering required by the application programs is 
performed using OpenGL [20]. Most input devices of the VIP, i.e., the keyboard, 
mouse and pen, can be operated using available drivers, such as the Wintab 
library for pen input (see http://www.pointing.com). In order to simplify the 
programming and testing of applications that involve video-based interaction 
(i.e., the tracking of the BELs on the table), a software library has been developed 
in-house. This library allows easy access, processing and storage of images, and 
hides the characteristics of the specific frame-grabber board used to capture the 
images from the camera. The mappings between the three coordinate systems 
involved, i.e., those of the camera, the projector and the tablet, are also handled 
by this library. Good registration between these coordinate systems is required 
in order to guarantee that the visual feedback provided by the projector occurs 
at the actual BEL or pen positions. 
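
As a rough illustration of such registration, the sketch below fits a mapping
between two coordinate systems from three calibration point pairs and chains the
camera-to-tablet and tablet-to-projector mappings. It assumes an affine model,
which is a simplification chosen here for brevity; a camera or projector that
views the table at an angle generally requires a projective mapping (homography)
estimated from at least four point pairs. The function names and calibration
values are hypothetical and unrelated to the in-house library.

```python
from typing import Callable, Sequence, Tuple

Point = Tuple[float, float]


def fit_affine(src: Sequence[Point], dst: Sequence[Point]) -> Callable[[Point], Point]:
    """Fit x' = a*x + b*y + c, y' = d*x + e*y + f from three point pairs."""
    (x1, y1), (x2, y2), (x3, y3) = src
    det = x1 * (y2 - y3) - y1 * (x2 - x3) + (x2 * y3 - x3 * y2)
    if abs(det) < 1e-12:
        raise ValueError("calibration points must not be collinear")

    def solve(v1: float, v2: float, v3: float) -> Tuple[float, float, float]:
        # Cramer's rule for one output coordinate.
        a = (v1 * (y2 - y3) - y1 * (v2 - v3) + (v2 * y3 - v3 * y2)) / det
        b = (x1 * (v2 - v3) - v1 * (x2 - x3) + (x2 * v3 - x3 * v2)) / det
        c = (x1 * (y2 * v3 - y3 * v2) - y1 * (x2 * v3 - x3 * v2)
             + v1 * (x2 * y3 - x3 * y2)) / det
        return a, b, c

    ax, bx, cx = solve(dst[0][0], dst[1][0], dst[2][0])
    ay, by, cy = solve(dst[0][1], dst[1][1], dst[2][1])

    def apply(p: Point) -> Point:
        return (ax * p[0] + bx * p[1] + cx, ay * p[0] + by * p[1] + cy)

    return apply


# Hypothetical calibration: three reference marks seen in each coordinate system.
camera_marks = [(0, 0), (640, 0), (0, 480)]
tablet_marks = [(0, 0), (1280, 0), (0, 960)]
projector_marks = [(16, 12), (1040, 12), (16, 780)]

camera_to_tablet = fit_affine(camera_marks, tablet_marks)
tablet_to_projector = fit_affine(tablet_marks, projector_marks)

bel_in_camera = (320, 240)  # BEL position reported by the tracker
bel_on_tablet = camera_to_tablet(bel_in_camera)
feedback_pixel = tablet_to_projector(bel_on_tablet)
print(bel_on_tablet, feedback_pixel)
```

Chaining the two mappings gives the projector pixel at which feedback for a
tracked BEL should be drawn; pen positions, which are already in tablet
coordinates, only need the second mapping.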

2.3 Motivation and Properties 

Preserving the characteristics of traditional media, while augmenting them with 
access to new functionality is the design philosophy underlying augmented reality 
[21]. One of the early AR systems, called the DigitalDesk [22,9], was for instance 
designed for the purpose of combining real paper and electronic documents in 
augmented reality. The system recognized inputs from pen or finger. It used a 
video camera to capture paper documents placed on the desk, and responded to 
interactions by projecting electronic images down onto the desk. The interactions 
were mostly ‘select’ and ‘cut and paste’ operations. The VIP is very much related
to this DigitalDesk, and has been designed as a development platform for AR 
applications. Especially the ability to create augmented paper [23,24,25] has 
had a strong influence on the development of the VIP, and we therefore discuss 
the motivation and implementation of this concept in somewhat more detail. 

People are taught drawing and sketching, alongside speech, from a very
early age. Writing enters life a little bit later. Most of us cannot even recall
ourselves without these skills. From this point of view, pen input is as natural
as speech and should be an important input modality in any Natural User Interface
(NUI) [4]. Pen and paper are also traditional companions in many creative
activities. It is generally accepted that drawing or sketching supports thinking, 
recollection of earlier ideas, making associations, etc. and is hence very valu- 
able in developing and shaping ideas [26]. Writing has mostly been considered 
as a (fairly unreliable) alternative for the keyboard in computer interfaces, and 
its use is mostly restricted to palm-top and notebook computers. It is only re- 
cently that the importance of providing good sketching interfaces starts to be 
recognized [27,28]. 

Rotation and translation of the paper while drawing or sketching are very
natural and almost subconscious actions. Restricting the user's ability
to perform them can potentially change his/her attitude to a pen-and-paper-based
system [29]. It is considered important to respect this freedom in handling
the paper on which one is writing. Video images are therefore used within the
VIP to record the size, position and orientation of the paper. In order to realize 
this, papers are either tagged with infrared reflecting tape or mounted on an 
infrared reflecting tablet. By combining several technologies, the VIP system is 
hence able to capture pen and paper movements with little disturbance to the 
user. This being accomplished, the real piece of paper can be augmented by 
adding virtual information by means of video projection. 

We conclude by summarizing the most important isolated characteristics of 
the action-perception space of the VIP: 

1. the action and perception spaces coincide, i.e., visual feedback is provided 
at the position where the action is performed; 

2. two-handed interaction is possible, which for instance allows the user to move an
object such as a piece of paper with the non-dominant hand while writing 
or selecting items with the dominant hand; 

3. multiple users can collectively interact at the same time, using separate inter- 
action elements, thereby promoting group work (with one notable exception, 
i.e., the Wacom tablet allows for only one pen to be used at the same time); 

4. easy-to-learn interaction style that requires little or no computer skills, i.e., 
only object and pen movements; 

5. the users do not have to wear intrusive devices like head-mounted displays 
that are likely to interfere with their social interaction (the only exception is 
the use of polarized glasses in case of stereo projection in the communication 
space, which is sometimes used to examine 3D models); 

6. there are no messy wires to hinder user movements. 

3 VIEWs Within the Electronic Paper Prototype 

We now discuss how the concept of VIEWs has been implemented on the VIP. 
The (vertical) communication space of the VIP is used to display the standard 
windows environment. Any window application can be started in the usual way 
and viewed in the communication space, while being controlled by the mouse 
and keyboard. The transparent interface program that implements the VIEWs 
is started in a window that occupies the (horizontal) action-perception space. 
This latter program can be instructed to find (the software handle to) a specific 
active window application (such as MS Paint) or can itself start a desired window 
application. It can also monitor the visual output of this application and show a 
copy of it as the so-called VIEW in the action-perception space, as illustrated in 
Figure 2. We start by describing the original Electronic Paper (EP) prototype, 
and subsequently indicate how the VIEWs concept has been integrated in it. 

3.1 Electronic Paper 

The EP was conceived as a user interface for early architectural design. It aims
at supporting the freedom, flexibility, abstraction, speed and ease of use of traditional
pen and paper, while at the same time providing access to computer functionality.

Fig. 2. The Electronic Paper (EP) interface in the action-perception space is shown at
the bottom with its main elements: the image database browser (1,2,3), virtual papers
(4), the function menu (5), the enhanced paper prop (6), pins (7) and clips (8). It also
contains a VIEW (9). The green rectangles are the (animated) visual feedback at the
positions of detected BELs. The communication space at the top contains the actual
windows application (in this case, Corel Painter 8) that corresponds to the VIEW.



The action-perception space of the EP prototype is shown in the lower part of 
Figure 2. It contains an image database browser (IDB), Virtual Papers (VPs), a 
function menu (FM), an Enhanced Paper Prop (EnPP), virtual clips and pins. 

The IDB is located in the left margin of the action-perception space and 
contains empty pages (of different colors and sizes, and with different raster 
patterns) and images that the user has acquired previously (either through this 
EP interaction tool or through other means). The browser contains an image 
database selector (1), a preview on a subset of thumbnail images in the selected 
database (2), with two buttons at the bottom and two scroll bars on the sides 
for scrolling through the database, and a preview window that shows a high- 
resolution version of the currently selected image (3). The digital pen can be 
used to operate the scroll bars on the sides and to adjust the size of the IDB,
so that more or fewer thumbnail images can be viewed simultaneously. A BEL
positioned on a scrolling arrow or moved along a scroll bar can also be used 
to move the preview selection across the database. Visual feedback is always 
provided on the surface of a BEL. The feedback depends on the position of the 
BEL and suggests the actions that are afforded by the BEL. For example, if the 
BEL is positioned on a thumbnail, then the animated arrow shows that the user 
can now drag the image outside of the browse menu. A copy of any image in 
the browser can be selected and dragged (with the BEL) or picked and dropped 
(with the digital pen) [30] into the working area, hence creating a VP. 

By moving the BEL, preferably with the non-dominant hand, the user can 
simultaneously change the position and orientation of a VP. When precision is 
required and the dominant hand is available, the digital pen can also be used 
for fine-tuning the position and orientation. The pen can however only modify 
one attribute at a time. The digital pen is also used to annotate or sketch on a 
VP. These annotations or sketches can be saved or printed for future use. A VP 
has several properties like transparency level, size, sketching ink color and pen 
thickness. To adjust these properties the user can use the movable FM. When 
any of the four corners of the FM is moved within the boundaries of a VP, then 
the FM operates on that VP, thereby enabling the user to change its properties. 
Menu selections and parameter adjustments are performed with the digital pen. 
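
The rule that the FM operates on a VP as soon as one of its four corners lies within that VP amounts to a point-in-rotated-rectangle test. The Python sketch below is not taken from the EP implementation; the `Rect` class, the function `target_paper`, the coordinate conventions, and the choice to return the first matching paper when several overlap are assumptions made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Rect:
    """A rectangle with center (cx, cy), size, and rotation angle in radians."""
    cx: float
    cy: float
    width: float
    height: float
    angle: float = 0.0

    def corners(self):
        """Return the four corner points in workspace coordinates."""
        c, s = math.cos(self.angle), math.sin(self.angle)
        hw, hh = self.width / 2, self.height / 2
        offsets = ((-hw, -hh), (hw, -hh), (hw, hh), (-hw, hh))
        return [(self.cx + dx * c - dy * s, self.cy + dx * s + dy * c)
                for dx, dy in offsets]

    def contains(self, point):
        """Test a point by rotating it back into the rectangle's own frame."""
        px, py = point[0] - self.cx, point[1] - self.cy
        c, s = math.cos(-self.angle), math.sin(-self.angle)
        lx = px * c - py * s
        ly = px * s + py * c
        return abs(lx) <= self.width / 2 and abs(ly) <= self.height / 2


def target_paper(menu, papers):
    """Return the first paper containing any corner of the menu, else None."""
    for paper in papers:
        if any(paper.contains(corner) for corner in menu.corners()):
            return paper
    return None
```

For example, with a VP given by `Rect(100, 100, 200, 100, angle=math.pi / 2)`, a menu centered at (170, 100) with size 60 by 40 has a corner inside the VP and would therefore operate on it, whereas the same menu centered at (300, 100) would not.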

The Enhanced Paper Prop (EnPP) is a sheet of real paper with infrared 
tagging (that can exist in different sizes). The EnPP can be attached to a VP 
and a real or digital pen, with or without ink cartridge, can be used to draw on 
it. 

The EP prototype supports additional tools such as clips (for merging dif- 
ferent VPs or merging VPs with the EnPP), pins (for attaching a VP to the 
workspace), an eraser (to erase part of a sketch) and grids (grids can be pro- 
jected on top of a VP or on the EnPP to support drawing in perspective or 
isometric mode for instance). 

3.2 VIEWs 

The EP prototype has recently been extended to include the VIEWs concept. A 
VIEW has the same properties as a VP, but instead of an image or sketch the 
VIEW contains a windows application. A VIEW can be created in the same way 
as a VP using the IDB. In this IDB windows applications are represented by 
appropriate icons. When the user creates a VIEW, the corresponding windows 
application is started and positioned within the communication space. Currently, 
the prototype supports up to three VIEWs simultaneously. A VIEW is dynam- 
ically updated by monitoring the video output from the windows application. 

The AR capabilities of the EP allow for several additional interactions with 
the windows application. We present a number of these augmented interaction 
styles, some of which are already available in our current prototype, and some 
of which still remain to be implemented. The list is obviously non-exhaustive. 

First, the EP tool captures all pen events that occur in the VIEW, i.e.,
the area occupied by the image of the windows application, and maps them to
mouse events at the corresponding position in the original windows application.
These mouse events will usually influence the visual appearance of the windows 
application. These visual changes are however also reflected in the VIEW that is 
presented in the EP. In this way, the user not only gets the impression that the 
pen is controlling the windows application, but also receives the visual feedback 
at the position where the pen is located. This visual feedback at the position 
where the action is performed makes many operations, especially sketching and 
writing, easier and more natural. Since all operations in the action-perception 
space occur in a horizontal plane, physical aids such as rulers and curve guides 
can be used very easily to assist in making pen drawings. We have noticed 
that, for some operations such as typing, looking at the original window on the 
vertical screen is often preferred. This is one of the motivations for maintaining 
both views. This task-dependent preference for a horizontal or vertical workspace 
warrants further experimental examination. 
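
The mapping from a pen position in the VIEW to a mouse position in the original windows application can be sketched as an inverse transform. The Python function below is only an illustration under stated assumptions: the name `view_to_window`, the representation of a VIEW by its center, rotation angle and displayed size, and the pixel conventions are hypothetical and not taken from the EP prototype.

```python
import math


def view_to_window(pen_xy, view_center, view_angle, view_size, window_size):
    """Map a pen position (workspace coordinates) to a pixel in the source window.

    Returns None when the pen lies outside the VIEW, in which case no mouse
    event would be generated for the windows application.
    """
    # Express the pen position relative to the VIEW center.
    px = pen_xy[0] - view_center[0]
    py = pen_xy[1] - view_center[1]
    # Undo the rotation of the VIEW on the table.
    c, s = math.cos(-view_angle), math.sin(-view_angle)
    lx = px * c - py * s
    ly = px * s + py * c
    # Normalize to [0, 1] across the displayed VIEW.
    u = lx / view_size[0] + 0.5
    v = ly / view_size[1] + 0.5
    if not (0.0 <= u <= 1.0 and 0.0 <= v <= 1.0):
        return None
    # Scale to window pixels, clamping the far edge to the last pixel.
    x = min(int(u * window_size[0]), window_size[0] - 1)
    y = min(int(v * window_size[1]), window_size[1] - 1)
    return (x, y)
```

Under these assumptions, a pen placed at the center of an unrotated 200 by 150 VIEW of an 800 by 600 window maps to pixel (400, 300), and the mapping remains valid when the VIEW is moved or rotated with a BEL, since only `view_center` and `view_angle` change.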

Second, a tangible interaction tool such as a BEL or the EnPP can be used 
in the non-dominant hand to control the position and orientation of the VIEW, 
while interacting with the pen in the dominant hand. A more natural interaction, 
that closely resembles writing or sketching with a real pen on paper, can be 
accomplished in this way. 

Third, a VP or a sheet of real paper can be positioned to coincide with the 
VIEW. The user can write and draw on such a (real or virtual) sheet, using the 
image from the windows application (such as the page of an interesting website) 
as background. This may for instance assist in overdrawing parts of an existing 
document. A VP may also capture an outlined portion of the VIEW, which 
may for instance be useful for creating collages of input material, in a way that 
resembles cutting and gluing in the real world. The VIEW may also be made 
transparent and overlaid on an existing virtual (or real) paper, so that this latter 
paper can be used as the background for the windows application. Aspects of this 
background image may be overdrawn or “captured” by the windows application.
Several different VIEWs can be combined in a similar way. 

Note that the above interaction styles are generic, in the sense that they do 
not require any knowledge about the windows application that is tied to the 
VIEW, i.e., this windows application is a “black box” for the transparent interface.
This is the case when only mouse events and single-key entries need to
be communicated to the windows application. Therefore, it was fairly easy to 
incorporate these interaction styles in the EP prototype. In case more advanced 
control over a windows application is pursued, then specific knowledge about 
this application will often be required. We illustrate this situation by means of 
an example. Different kinds of transparent menus called toolglasses [31,32,1], of 
which the FM is one example, can exist in our transparent interface. A toolglass 
can be moved (by means of a BEL in the non-dominant hand) in order to al- 
low the menu items in such a toolglass to be selected with the pen. Different 
toolglasses may also correspond to different sides of a single tangible interaction 
element [33]. The selection of a menu item in such a toolglass may be used to
trigger a sequence of mouse (and/or keyboard) events for the windows application.
For instance, moving a toolglass with colored squares and clicking on a
red square when it is on top of a rectangle in an MS Paint application may
result in a sequence such as selecting the rectangle, changing the pen
color to red, performing a fill operation, and resetting the pen color to its original
color. Obviously, although such toolglass operations are potentially very pow- 
erful, they also require specific knowledge of the windows application on which 
they are operating. Fortunately, recent software developments such as the use of 
C# .NET to automate (i.e., control from within your own program) instances of
(office) programs (see http://support.microsoft.com) bring such scenarios within 
the reach of application programmers. 

One example of a toolglass operation that may for instance be easily imple- 
mented for a large number of applications is to virtually attach (and detach) 
the toolglass to the canvas in order to perform scrolling of the canvas with the 
non-dominant hand (instead of using the sliders on the sides of the canvas) [1].
Another example is a tool for creating spaces or other frequently used items
(such as figures and tables) in a document. These spaces may subsequently be
filled with pictures or drawings created by one of the tools discussed above. Such
“enhanced” interaction styles are currently not implemented in our prototype
and need to be explored further. 

None of the above interaction styles assumes any pattern recognition capabil- 
ities from the side of the system. Obviously, if the system is also able to perform 
handwriting (or more generally, symbol or gesture) recognition, then many more 
applications can easily be imagined. One example would be entering numbers 
into a calculator [9] or spreadsheet application by means of handwriting. This 
could be realized by generating keyboard events from handwriting actions (for 
instance occurring in a pre-specified area of the transparent interface).



4 Discussion and Future Work 

In this paper, we have demonstrated how the Visual Interaction Platform can be 
used to create an augmented reality interface to traditional windows applications. 
The resulting visual interaction enriched (or augmented) window applications 
are called VIEWs. A carrier application, called the Electronic Paper, has been 
adapted in order to allow more detailed exploration of the VIEW concept. 

The system is now sufficiently developed to start performing more formal 
usability evaluations, through which we can explore in more detail the advantages 
and the drawbacks of the VIEWs concept. In the paper up to now we have 
stressed the increased functionality that can be created and its potential use. 
This however comes at the price of an increased complexity in the interface, 
since the user is now expected to control two work-spaces, the action-perception 
space and the communication space, rather than a single desktop. Issues like how 
information can be passed between the work-spaces (could be through VIEWs 
or otherwise), which tasks map most naturally to which work-space, individual 
preferences and their impact on the interface, etc., need to be resolved. Such
issues cannot easily be addressed through traditional psychophysical studies,
where subjects are only exposed once to a new environment. The possibilities of
the new AR techniques have to be experienced, and the user should be allowed 
to evolve his/her preferred hybrid interaction style, before a fair comparison 
with existing interaction techniques can be made. We have therefore recently 
started a project in which a more longitudinal study with a specific user group, 
i.e., industrial designers, will be undertaken. These users have been selected 
because of their easy availability within our department, and because, unlike 
more traditional office workers, they use a mixture of very diverse techniques 
(and programs) to accomplish their work.

Abstract. We put forth a proposal for a system that creates an augmented social
environment where devices allow for an integrated capture of information about 
a social gathering (e.g. pictures, audio recordings, written thoughts and 
expression). Users can subsequently navigate around a collection of images, 
while listening to sounds and conversations recorded with the photo at the time 
of capture. We describe the devices used to capture this information and the 
user experience of the integrated Memory Collage application. The paper will 
also discuss the implications found for ambient intelligence within social 
environments. 



1 Introduction 

Our project’s primary aim is the exploration of audiophotography and other methods 
of capturing information in a social environment. Audiophotography [1] is a domain 
that studies the value and practice of recording sound with still photographs. 
Motivated by the research of Frohlich and Tallyn we are trying to explore the full 
affordances of the audio capabilities found on many digital cameras today. 

Although audio provides an additional dimension to memory recall on top of
viewing photos alone [1], the ways in which the sounds most relevant to the viewer are
recorded have not been explored. Video recordings arguably incorporate all
relevant sound and images together, but they are not selective in the
information they capture, making indexed retrieval difficult at a later date. To ensure
a high probability of capturing sound and images of importance, a system needs to 
determine the relevant events in its environment. 

We are developing applications for the capture of events in one’s social 
environment, and then presenting the audio enriched photos to the user in ways that 
evoke memories and emotions. This paper will detail our progress to date in 
augmenting a social gathering using microphones, cameras and sensors to obtain a 
more omnipresent view for sharing and personal viewing. We will then discuss the 
issues revealed throughout the project pertaining to ambient intelligence in social 
settings.

2 Related Work

The last few years have seen the development of a number of computational systems 
that capture information within their environment. Sumi, Matsuguchi, Ito, Fels, and 
Mase [2] developed a system, which integrates wearable and embedded sensors, 
video, and microphones, to observe interactions amongst multiple users. The 
resulting information is captured on an individual basis and does not allow for an 
omnipresent view of events. The Eavesdropper camera [3] automatically captures 
spontaneous moments in pictures when triggered by sounds such as laughter, voices, 
or noises. Frohlich and Tallyn [1] suggest that ambient “sounds-of-the-moment” 
convey a richer memory when paired with their corresponding photographs. We see 
an opportunity in developing a system that actively seeks to provide a novel means 
for recording important moments and multiple perspectives of a party. 



3 Memory Capture and Display 



An informal user study was conducted at a party organised by the authors to observe 
and record social interaction by participants within an instrumented environment. 
The location was a lounge on the campus of the University of British Columbia. The 
twenty-one participants consisted of professors, researchers, and graduate students. 
Questionnaires were distributed for initial feedback regarding their thoughts on the 
devices in the environment. We asked specifically about privacy issues and the 
obtrusiveness of the devices. Various sound recorders were placed in strategic 
locations to record ambient noise and coherent conversations near high traffic areas 
(e.g. foosball table, food table). Digital cameras carrying digital voice recorders (Fig. 
1) captured snapshots of the event with the idea that recorded audio would likely 
correlate well with a photo. Touch sensors (Fig. 2) provided participants an 
opportunity to note significant moments through a time-stamping mechanism. These 
time-stamped events and photos served as a temporal index into the audio recordings 
so that meaningful audio clips could be played during photo viewing. 
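
The use of time-stamped events and photos as a temporal index into the audio recordings can be sketched as follows. This Python fragment is an illustration only: the function name `clip_window` and the clip margins of 10 seconds before and 5 seconds after an event are assumptions, not values reported for the study.

```python
from datetime import datetime


def clip_window(recording_start, recording_length_s, event_time,
                before_s=10.0, after_s=5.0):
    """Return (start, end) offsets in seconds into a recording around an event.

    The window is clamped to the bounds of the recording. None is returned
    when the event falls outside the recording altogether.
    """
    offset = (event_time - recording_start).total_seconds()
    if offset < 0 or offset > recording_length_s:
        return None
    start = max(0.0, offset - before_s)
    end = min(float(recording_length_s), offset + after_s)
    return (start, end)


# Hypothetical example: a one-hour recording and a photo taken 15.5 minutes in.
recording_start = datetime(2004, 1, 1, 20, 0, 0)
photo_time = datetime(2004, 1, 1, 20, 15, 30)
window = clip_window(recording_start, 3600, photo_time)
```

A photo viewer could then play the returned segment of the recording whenever the corresponding photo is selected. How wide the window should be, and whether it should lead or trail the event, is a design choice that the informal study does not settle.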




Fig. 1. Digital camera with digital voice 
recorder mount. 




Fig. 2. Decorative event-capturing touch sensor.



Using the information collected at the testing party, an application which we call a 
“Memory Collage” (Fig. 3) was created to elicit responses as to the usefulness of the 
sound and photo combination in evoking emotional memories of an event. The
Memory Collage application enables users to spatially manipulate images and arrange
them in a personalised manner, unlike the structured organisation common to photo- 
sharing applications and browsers [4]. Users were able to create meaningful 
arrangements of their choosing, as well as play back a clip of sound recorded when a 
photo was taken. 

Preliminary feedback indicates that this method of event capture and viewing 
allowed participants to recall situations they ordinarily would have had difficulty 
remembering. This phenomenon was particularly apparent when an audio clip captured
spontaneous remarks or comments that are normally quickly forgotten after an event. 
The Memory Collage demonstrated that there is a significantly stronger memory 
recall of events from audio-photographs. 




Fig. 3. Photos placed in personalised arrangement 



4 Conclusions and Future Directions 

Our project provides a first step into the research of information capturing devices in 
social environments, which we view as a new and relevant area of interest for ambient 
intelligence. The greatest challenge may involve using devices that rely on 
assumptions or fixed variables in a social environment. For example, we were not 
able to place recording devices in ideal locations, and we were unsure whether 
participants were using touch sensors for the intended purpose. These problems were 
encountered because of the inherent unpredictability of the environment. A system 
designed to be used in a shared, social space should be able to account for this 
unpredictability. 

Our future directions are two-fold. First we wish to investigate more salient 
methods of capturing sound. Some participants found that sounds were not always 
correlated to the photos they interacted with, producing a degree of confusion. This 
may be because the affordance of the system suggests that sounds should be very 
closely correlated to the focus of the photo. A possible solution could be to use a
Local Positioning System such as RemoteEyes [5] and to equip each participant
with a personal microphone. Sound clips could then be chosen by locating who is in
each photo. Secondly, we wish to develop a framework for the use of intelligent 
devices in a social environment. This framework should detail how to build a flexible 
system that can be fitted to a specific environment, depending on its variable and 
fixed attributes, such as the event space, the number of people, or the type of event. 

We present these ideas to provide a catalyst to the research of ambient intelligence 
in social environments. By increasing these efforts, we hope to provide further 
insight into the relationship between people and their environment when interacting 
with intelligent systems.

Abstract. The topic of this paper is rich interaction. Rich interaction borrows
from tangible interaction and the concept of affordances. This is achieved 
through integral design of form, interaction and function of products. It is 
applied to interactive consumer products. A digital camera with a rich user 
interface (RUI) was designed and compared in a user study to a digital camera 
with a more conventional user interface. Several issues concerning rich interfaces 
are discussed. 



1 Introduction 

Tangible interaction is a hot issue today. Coined in 1997 by Ishii & Ullmer [1], the
term comprises user-system interaction by means of physical representations of digital
data. Ullmer argues that by means of tangible interfacing user-system interaction can
be made more natural in that it fits human skills [2]. The examples commonly given
of tangible interaction include computer supported cooperative work (CSCW) and
computer supported tools [3] [4]. Although the first well-known example of tangible
interaction, the marble answering machine of Bishop [5], explored an alternative
interaction style with a consumer product, the relevance of tangible interaction
seems to be somewhat forgotten.

At our department of Industrial Design in Eindhoven research is conducted into
intelligent products, in particular interactive consumer products. Inspired by
examples of tangible user interfaces we envision those products to have what we
call rich user interfaces (RUIs). Rich user interfaces borrow from tangible user
interfacing techniques and from the concept of affordances [6]. Key to rich
interfacing is the notion that form, interaction and function are strongly related
to each other, see figure 1. Form invites interaction, and in this interaction functionality
is opened, preferably in a beautiful way. In order to design a product with a rich user
interface those three properties should be designed simultaneously. We see the tight
integration of form, interaction and function as a challenge particularly suited for
industrial designers. Increasing complexity of interactive products might be made
easier to grasp using rich interfacing techniques, for rich interfaces appeal to the
perceptual-motor skills of people [7] [8] and they allow for more sense modalities to
express information on use. This new interfacing approach will lead to new products,
which, as a consequence, will lead to more variety in interactive products.

Fig. 1. Circle model of properties

The digital camera was deemed a good example of an interface that degenerated in 
the process of going from analog to digital. To demonstrate the power of integrating 
form, interaction and function, a design for a digital camera with a rich interface was 
made. In this paper we first show and discuss the rich interface of this digital camera. 
Then we present a user study in which this digital camera was compared to a digital 
camera with a more conventional interface in terms of aesthetics of appearance,
aesthetics of interaction and ease of use. And finally we discuss our results. 

2 Example: Digital Camera with Rich Interface 

A design was made for a digital camera with a rich interface. The starting point for 
the design was a technical description of the functionality. It was decided to focus on 
the core functionality of a digital camera. It has the following feature list.

1. switch on/off

2. shoot a photo

3. reject a photo

4. store a photo

5. review/play photos

6. control size (pixels) of photo

7. zoom in/zoom out

The user-actions drove the design. Several pre-models were made to explore form, 
interaction and function. Step by step combinations of function and interaction were 
researched, put into form and tested out. The design process was an iterative process. 
In testing and changing pre-models the opening of functionality through form and 
interactivity was assessed. The result of this process was a cardboard mock-up of the 
camera that offers action possibilities, see figure 2. 

Functionality is expressed solely in the form and in the interaction with the form of
the camera, and not in a screen-based user interface. Although the camera does have a
screen, this screen is only used to display pictures; it is not used to navigate through
menus. The controls of the camera not only express what you can do with them, they
also express what will happen when you use them [9]. For example, the trigger 
expresses that it can be pushed. It also shows that it restrains the screen in the closed 
position. The screen has two possible positions, it can align with the lens and it can 
align with a trajectory towards the memory card. In this way we try to convey the 
message that when the trigger is pushed the screen will flip in the other position, thus 
capturing an image.

When the lens cap (a) is taken off, the camera switches on and displays the image on the screen at
the backside of the camera. The pixel size of the photos can be set (e.g. 2560x1920 or 1600x1200)
by changing the size of the screen with physical ‘scalers’ (b). The removable memory card is
always visible (c).

At the sides of the lens two small handles are placed. When the handles are pulled (d) the lens 
comes out of the body and one can zoom in on the object of interest. When the composition seems 
good, the trigger can be pushed (e) to capture the image. The screen will flip away from the lens 
by means of a spring (screen open position (f)) and one is given the opportunity to review the 
photo. It now can either be saved or deleted. 

When the photo is satisfactory it is saved by moving the screen towards the memory card (g). 
The photo will ‘flow’ from the screen into the card, the screen blanks. The screen is spring loaded 
and will return to the screen open position when released, it can then be clicked back against the 
lens and a new picture can be made. If however the photo is not satisfactory the screen is just 
clicked back (h) against the lens, the image is not saved and disappears, a new picture can be 
taken. 

If the screen is held against the memory card, it clicks into place and it will start to display the 
images that were stored in the memory card. Those images can be browsed using a small lever 
(i) that is exposed when the screen is moved towards the memory card. 



Fig. 2. Camera with a RUI

3 User Study



We conducted a user study to investigate how the camera's rich user interface compared 
to a camera with a conventional user interface in terms of usability and beauty. These 
latter notions are central to human experience and thus to industrial design. 




Fig. 3. Stimuli: camera with a rich interface and camera with a conventional interface 
(simulated screen content shown bottom right) 

3.1 Stimuli 

In the study, two mock-ups of digital cameras were compared. One was the camera with 
the RUI described earlier, the other was a mock-up of a camera with a more 
conventional interface. It was decided to use an existing camera (Pentax Optio S) as 
an example of a conventionally interfaced camera; however, it was scaled 125 percent 
in size so that it was comparable in size to the RUI-camera. To ensure consistency in 
look and feel the two mock-ups were both made out of cardboard. Both models had 
limited action possibilities. The RUI-camera allowed for moving its control and 
feedback elements, but lacked the spring loaded reactions. The conventional camera 
had glued on ‘buttons’ and a simulated screen content, and though the buttons could be 
touched they did not respond to pressing, see figure 3. The cameras did not 
autonomously provide feedback since they were built out of cardboard. 



3.2 Procedure 

The study took place in a room where the surveyor and participant were sitting at a 
table. At the end of the table a dummy object was present as a point of interest for the 
camera to be pointed at. The whole study was recorded on videotape. See figure 4. 




Fig. 4. Setup of the user study 
Ten subjects of different backgrounds, age (13-59 years old) and gender (2 female, 
8 male) participated. The study consisted of three parts. The first camera was evaluated, 
then the second camera was evaluated and finally the subjects were asked which camera 
they liked best. The order was alternated between participants, so that either the 
RUI-camera or the conventional camera was shown first. Only in the third part of the 
study were both cameras present on the table. 

The first and the second part of the study had a similar setup. The participant was 
first asked to speculate spontaneously on the operation of the camera. Then a series 
of small assignments was given in which the participant was walked through the 
functionality of the camera (switch on, zoom in, take a picture, save picture, do not 
save picture, play saved pictures, set the resolution of the camera to small). During 
those assignments the surveyor provided feedback on the participant's actions by 
manually moving parts of the mock-up when necessary. After this the participant was 
asked whether he or she liked the camera and the way it could be used to make photos. 
If the participant had a different way of using the camera than we had in mind, the 
intended workings were explained. In that case the participant was again asked what he 
or she thought of the camera. Part one and part two of the study were concluded by 
asking the participant whether any functions were missing and whether he or she had 
any remarks on the camera. 

The third part of the study consisted of a comparison of the two cameras on three 
aspects: the aesthetics of form, the aesthetics of use and the ease of use. 

Finally the subject was asked to fill in a short questionnaire (gender, age and 
occupation). 



3.3 Findings 

The first author analyzed the almost five hours of videotape that resulted from the 
experiment. 

Part 1&2: In parts 1 and 2 of the user study the two cameras were assessed separately. 
When asked to speculate on the workings of the cameras the participants were better at 
explaining how the conventional camera was operated, see figure 5. They were not sure 
what to do with the RUI-camera; they kept trying to find buttons, especially a menu 
button, and were frustrated that no labels were present. When asked to complete a series 
of small assignments the conventional camera again did better, although the results were 
more nuanced, see figure 6. 

[Chart axis label: number of participants] 




Fig. 5. Speculation on the workings of the two cameras - did the participant succeed in 
fulfilling the assignment: yes, more or less (~), no 
[Chart legend: camera with conventional user interface; camera with RUI] 




Fig. 6. Walkthrough of limited set of functionality - did the participant succeed in fulfilling the 
assignment: yes, more or less (~), no 



Part 3: In part 3 of the user study the participants were asked to state their preference 
for one of the cameras on aesthetics of form, aesthetics of use and ease of use, see figure 7. 
At this time the workings of both cameras had been thoroughly explained. When asked for a 
preference for form, most participants preferred the RUI-camera over the 
conventional camera (figure 7, row 1). The reason given most often (6 times) was that 
its form was more outspoken. When asked to make a 
choice for a camera on aesthetics of use and ease of use there was less consensus. 
Several participants were unable to make a choice (figure 7, rows 2 and 3). Four participants 
claimed that they would use the RUI-camera in different situations than the conventional 
camera: the conventional camera for holiday pictures, the RUI-camera for more 
professional use. One person found it difficult to differentiate between aesthetics of 
use and ease of use. 




[Chart legend: camera with conventional user interface; camera with RUI; no choice] 



Fig. 7. Preferences of the participants 



4 Discussion 

Of course, when comparing just two cameras no general conclusions can be drawn about 
interfacing techniques. Still, a lot was learned from the study: about the study itself and 
about interfacing techniques in general. 







4.1 On the Study 

Bias caused by lack of interactivity: The proper operation of the RUI-camera 
depends on the interplay between feedforward (expression of form and movement) 
and feedback (camera reactions). In contrast, the proper operation of the conventional 
camera depends on labels on its controls and to a lesser degree on feedback. The mock-ups, 
however, did not provide autonomous feedback, even though the subject of investigation 
was interaction techniques. This resulted in an advantage for the conventional camera 
in the user study. 

The conventional camera is designed with labels on its buttons that literally tell a 
user what will happen if those buttons are pressed. The RUI-camera is designed to 
express in its form and in its movements what can be done with it (action possibilities/ 
affordances), what the camera will do (functionality) and how the form of the camera 
will change as a result of that action to express new action possibilities. Therefore it is 
much more important for the RUI-camera that it actually reacts to users' actions than 
it is for the conventional camera. 

No naive users: There seem to be no naive users around anymore when it comes to 
cameras (analog or digital) or to on-screen graphical user interfaces. The participants 
are used to the idea that electronics can only be operated with buttons. They were 
quite unwilling to directly manipulate the parts of the RUI-camera. 

4.2 On Interfacing Techniques in General 

On rich interaction: The workings of the RUI-camera are action driven. Form and 
action possibilities were designed simultaneously. We intended the camera to be a 
physical reflection of the story of the action possibilities it offers. However, in the study 
we found that it was not always clear to the participants how to operate the camera. 
We suspect this is because the story that is told by the camera occasionally is a very 
technical one. To give an example, the screen, and thus the picture, should be moved 
towards the memory card to save the picture on that memory card. This is a reflection of 
the process that is at work in the insides of the camera. This is not necessarily the same 
as the user's mental model of the workings of the camera. This is a problem: it is 
relatively easy to point out where a design fails, but it is very hard to get it right. 

Opening up functionality is crucial for interactive products. All too often the 
technical functionality drives the interaction with a product, the argument being that 
since functionality is what defines a product, functionality should be delivered to the 
user in its purest form. Rich interaction, however, goes deeper than interaction alone. 
Rich interfaces are designed to exploit the expressiveness of form to invite the user to 
explore what can be done with a product. Functionality is intangible: only 
through physical form can the user reach functionality, and only by designing this 
form for interaction is the functionality opened up for use. 

Does rich interfacing make sense? During the user study the participants kept trying 
to find buttons to apply functions to when exploring the camera with the rich 
interface. The camera with the rich interface had few extremities that could be 
perceived as buttons, and still functionality was assigned to diverse spots on the 
camera, seemingly at random. Why was this? Earlier we already speculated that naive 
users do not exist anymore. People expect buttons; they know conventional interfaces 
and they seem to expect those kinds of interfaces, presumably because there are a lot 
of those interfaces on the market today. 

The important question of course is whether we should leave it at that, for people seem to 
cope and rich interfaces are hard to design. Do people want the linearity and the 
sequentiality of conventional interfaces or do they want the expressive and dynamic 
interaction of rich interfaces? We do not want to make this choice for people. But 
what we do recognize is that people increasingly have problems with the featurism of 
conventional interfaces. 

5 Concluding 

• To assess its value, a RUI-prototype has to be a working prototype. That is, it has 
to react to users' actions in the way intended by its designer. 

• A follow-up experiment with the same participants is considered to investigate if 
RUIs are remembered better than conventional interfaces. 

• It is hard to design rich interfaces, for in the early stages of the design process 
feedback and feedforward [9] are missing because of the lack of interactive mock-ups. 

• In retrospect we find that our RUI-camera is designed with too much emphasis on 
technical functionality. We think this can be remedied by exploring not only form 
and interaction but also functionality when designing future RUIs. 

• On aesthetics of appearance. The participants liked the RUI-camera - design 
makes a difference. 

• On aesthetics of use, the jury is still out... 
Abstract. The concept of metaphor has been used in UI design in a loose 
manner. There is a need to conceptually separate it from related concepts to 
restore the power it used to have in rhetoric. It is also important to understand 
the life cycle of metaphor, how it changes over time in the conceptualisation 
process. This is especially topical in ubiquitous computing, in which entirely 
new concepts and interaction styles are introduced. In this paper, we describe 
the use of metaphors and related concepts in theory and apply the approach in a 
mobile application. 



1 Introduction 

The commercial success of the graphical user-interface (GUI) can hardly be seen as 
the result of its superior usability over the command line interface. Actually, 
especially at the outset of the triumphal march of the GUI, there were a number of 
suspicions about it [9, 11, 19]. We can still argue endlessly for and against the 
usability of GUIs, but the thing that cannot be denied is that commercially the GUI - 
in the form we learned to know it in the Apple Macintosh and MS Windows - was a 
success story. 

Underlying the success of the GUI there has to be a strong user preference; 
otherwise we would not have witnessed its triumph. As evidence of its superiority in 
terms of measurable usability factors is lacking, the obvious conclusion is that 
subjective satisfaction is the factor that determines consumer choices. Subjective 
satisfaction affects not only commercial success but also overall usability, as Norman 
[18] points out. 

However, the arguments for GUI are not usually based on the extremely subjective 
qualities of the GUI, such as its attractiveness. Instead, the GUI was seen as usable 
because it is based on metaphors (e.g., [1]). In this context, metaphors are introduced 
as a means of making learning easier. In virtual worlds, we constantly face new kinds 
of entities that lack real world counterparts. By creating a parallel between an already 
known (usually real-world) entity and the new entity, the learning of the nature of the 
new, it is argued, is made easier [4, 15]. This parallel is usually called metaphor. The 
same aim is often referred to as learnability or ease-of-learn. In addition to 
ease-of-learn, the rationale for using metaphors is often a more extensive concept: ease-of-use 
[17]. 

The rationale for the use of metaphors seems to accord quite well with the 
traditional uses of metaphors in verbal communication. Aristotle [2] considered the 
strength of metaphor to lie in its ability to enliven a presentation, illustrate and clarify, 
direct emotions, and express entities with no name. In this classical view metaphors 
are seen as stylistic devices, and above all, as a means of creating associations 
between the known and the new. However, when the classical concept of metaphor 
has been used in the context of GUIs, the content of the concept has changed. Current 
usage of the concept in GUI design keeps the superficial characteristics of the 
traditional meaning but discards something very essential. This has led to inflation in 
the power of metaphors. 

This study is a conceptual analysis of metaphor and an attempt to sharpen its 
usage. The work is motivated by the need to support conceptualisation processes 
when users face novel designs, especially in the context of ubiquitous computing. 
Since many of the approaches in human-computer interaction were created for the 
needs of desktop computing, novel approaches are needed and metaphors are 
proposed as a central means in rising to the challenge. 

As this is also a defence of metaphor as a design principle, we start by reacting to 
two familiar arguments against the use of metaphors in GUIs. After having 
formulated our view of metaphors (and related concepts), we describe a sample 
design, a portable music player, in order to illustrate our ideas. The sample design is 
discussed in terms of an evaluation we carried out. 



2 Concept of Metaphor in the Context of UI Versus in Rhetoric 

2.1 Argument 1: “Metaphor Can Never Cover the Whole Domain of Its 
Referent” 

The most familiar argument against the use of metaphor as a design principle is that in 
a metaphorical setting, the virtual entity and its real world counterpart differ from 
each other. It is argued that this difference misleads the interpreter of the metaphor 
(user). It is further argued that when trying to imitate a real world entity with a virtual 
artefact, the functionality of the virtual entity is restricted [8, 10, 16]. 

Creating a metaphor means finding analogies between the known and the new. 
Certainly, a metaphor can never have all of the properties of the entity it refers to - 
and vice versa. However, it has to be noticed that the endeavour toward similarity 
between a metaphor and its referent is unique in the UI-context. Elsewhere, it is 
actually the differences that are seen as the strength of a metaphor. Hamilton [7] 
argues that the mismatch is the core of a metaphor. She writes how the mismatch 
makes one pay attention to “parallels not immediately apparent from the direct 
comparison”. Referring to the same thing, Carroll and Mack [4] write about the 
open-endedness of metaphors and find this a strength, essential for the stimulating effect of 
a metaphor: “It is this property of metaphor that affords cognitively constructive 
processes which can lead to new knowledge. From the perspective of active learning, 
the open-endedness of these kernel comparisons is intrinsic to the mechanism that 
allows them to work” (p. 395). 

The idea that a metaphor and its referent should resemble each other as much as 
possible collides with all central definitions of metaphor. A good example of the 
weirdness in the usage of the concept of metaphor in GUIs is a push-button. No 
doubt, most people who have followed the development of GUIs, would call a virtual 
push-button a metaphor. In the contemporary GUI, a virtual push-button looks (even 
when pushed) and possibly sounds like its physical counterpart. With the 
development of computer graphics, a push-button has been developed to resemble 
more and more a physical one. This tendency would end up in a situation where the 
difference between a virtual and real push-button cannot be perceived. Aristotle [2] 
declared the ability to create good metaphors as “a sign of genius”, an innate talent to 
see “similarity in dissimilars” (p. 2335). He would hardly have found a virtual 
push-button with its obvious similarities to be a metaphor at all. 

If we are trying to imitate a real world entity as accurately as possible, we have a 
good concept already in use. The Oxford English Dictionary (OED) defines the verb 
“simulate” as follows: “To imitate... by means of a model...” The OED’s description 
of the corresponding noun simulation is in accordance with this. We conclude that 
when striving towards the highest possible similarity between a virtual and a real 
object, we are talking about simulations, not metaphors. 

Simulations and metaphors should be distinguished conceptually from each other 
in design. This is simply because they have different strengths and should be used 
accordingly. However, making the distinction between these two in practical design is 
not at all that simple. These problems are discussed later with reference to our sample 
design. 

We conclude that two important features of metaphors are neglected when 
flattening the use of the concept of metaphor to imitation of real-world entities: 

1. The first argument concerns the designer. We argue that a working metaphor 
demands from the designer creativity, inventiveness, and a deep view of human 
mental processes. Creating a metaphor that is simultaneously appropriate (makes 
essential qualities salient) and unexpected (stimulating) demands much more than 
simple imitation of the most obvious point of comparison in the real world. 

2. The second argument deals with the user. The key question is: do we see the user 
as a passive receiver of the ideas of the designer or do we count on the active 
mental processes of the user? Again, we refer to Carroll and Mack [4], who 
describe metaphors as a way to make users pose problems for themselves. In other 
words, rather than using metaphors as a means to transfer knowledge or 
understanding from one person to another, this view underlines their role as 
inspiring a user’s own imagination and creativity. 

Giving a role to something like the user’s imagination and creativity might sound 
frightening in its uncertainty and indefiniteness. It sounds safer to search for strategies 
that provide methods to control the meaning construction process, rather than inspire 
it. Still, whether we acknowledge it or not, constructing meanings and mental 
representations is always subjective in nature. The process is tied to previous 
experiences and other subjective qualities. 

The difference outlined between the traditional meaning of metaphor and its 
somewhat loose use (when imitating real world entities) has important implications. 
When creating a simulator, e.g., a flight simulator, we have a clear target: our model 
is perfect when the user of the simulator cannot sense the difference between it and its 
real-world counterpart. Therefore, a simulator is in practice always a more or less 
imperfect substitute for something else. It is used, for example, because of safety or 
economy. But the simulation itself can hardly ever equal what it simulates. Metaphor, 
in the sense it is traditionally used, does not even have this kind of concrete 
counterpart it could be compared with. Since the aim of a metaphor is to make 
something essentially salient by drawing a parallel between contextually unrelated 
entities, the success of a metaphor cannot be evaluated in as simple a manner as the 
success of a simulation. Whether a metaphor works or not is dependent on both the 
metaphor and its interpreter. So there is no theoretical end-point called ‘the perfect 
metaphor’. For one person it might work just like its creator wished, for another 
person the same metaphor might fail totally. To express it in another way: To develop 
a simulation is a highly mechanical reproduction process. In turn, to develop a 
metaphor is a creative communication process. 

2.2 Argument 2: “Once Learned, the Metaphor Becomes Useless” 

The strength and rationale of metaphors in GUIs, according to the usual argument, lie in 
their power to facilitate learning the use of the computer [4, 7, 16]. However, we 
may ask (see [6, 10]): why use the metaphor after having learned to use the 
application? 

Cooper [5, p. 118] analyses the concept of dead metaphor. Even a brilliant 
metaphor may gradually become an idiom, thus losing its literal meaning. This idea 
could be applied to UI design when creating either verbal or non-verbal metaphors. 
Why couldn’t we let a metaphor die in peace? A durable metaphor could then have a 
different function for a novice user and an expert. For a novice user it could be a 
metaphor (in the traditional meaning), providing insights into the nature of a function 
or application. Gradually, it turns into idiom for the experienced user, still having 
some communicative value. 

Dead metaphor, or metaphor which has turned into idiom, is something that has 
been born as a metaphor. The connection to the source of the metaphor has supported 
it for a period of time, but gradually the need to maintain the associations with its 
source is reduced. Finally, it becomes totally independent of its ‘parent’ and lives its 
own life. The only thing (if any) that reminds us of its roots is perhaps its name or 
other symbolic presentation. Later, this new, independent concept may be a source for 
- give birth to - a new metaphor. 

Again, we end up with the subjective nature of metaphors. To be coherent in our 
concepts, we shouldn’t actually even talk about the creation of metaphor as the task of 
the designer. Rather, the challenge of the designer is to support the user’s metaphor 
creation process. This way we also turn from the classical or Aristotelian conception of 
metaphor to what is usually referred to as the modern theory of metaphor [12, 13, 14]. 
In it, metaphor is seen as a key means for a human being to construct knowledge. 

From now on, we shall still refer to metaphors and simulations as if they were 
something that could be designed. The term metaphor should therefore be interpreted 
merely as “a support for metaphor creation” or “metaphorical expression”. The same 
applies to simulation in our conceptual analysis, even though simulations can also be 
understood much more mechanically. 
3 Sample Design: The GestureJukeBox 

We now apply the proposed metaphor-approach to a sample design, a portable music 
player. The aim is to analyse the design process and to evaluate the usability of the 
implementation. 

We implemented a prototype of an mp3-player, GestureJukeBox, on an iPaq hand-held 
computer. The original idea was to find interaction techniques that would be 
useful in mobile environments. We ended up using simple gestures across the 
touch screen of the iPaq as the input method and providing feedback in the form of 
non-speech stereophonic sounds, earcons, following systematic guidelines [3]. This 
was because these could be used on the move, without looking at the device. Gaze 
was thus released for other use. The reasons for choosing a portable music player as 
the application were that 

1. it could be used with a small number of commands. The number of simple gestures 
on the touch screen that could be reliably distinguished from each other is rather 
small, and 

2. a concept of a portable music player and its standard functions could be assumed to 
be familiar to most of the potential users. Thus, the mental representations could be 
assumed to be based on previous experiences of similar devices. The assumed 
representations could be used as one of the determinants of the design. 

The design and evaluation of GestureJukeBox was preceded by TouchPlayer, which 
was implemented in the same technical environment and had the same aims as 
GestureJukeBox. The experience gained from the development and evaluation of 
TouchPlayer [20, 21, 22] was utilised in the development of GestureJukeBox. 



3.1 Design of Interaction 

This study focuses on seven basic functions of GestureJukeBox. These are: 

1. Play/Pause (Stop). These two functions are linked together just like they are in all 
music players: If the music is stopped, the available function is ‘play’ and vice 
versa. Therefore, just as ‘play’ and ‘pause’ functions are controlled with a single 
button in many players with function buttons, in our design they are controlled 
with the same gesture. The chosen gesture is a double tap anywhere on the touch 
screen. In our previous design, we used a single tap, which was easier to perform 
but was less reliable with regard to unintentional touches (especially in real-life 
conditions) and misinterpreted gestures. The feedback was provided in the form of 
a short open hi-hat sound. 

2. Next track. The function shifts the pointer in the play list to the beginning of the 
next track. In the case of the last track, the pointer goes to the beginning of that 
track. It is also possible by adjusting the settings to make a looped play list, so that 
the track after the last one is the first track. The required gesture is a horizontal 
sweep across the screen from left to right. The feedback sound is an electronic 
sound, which is ascending in pitch and panned to move from left to right. Thus, 
both the pitch and the direction of the sound illustrate the move forward. 
3. Previous track. The function works like ‘next track’ but in the opposite direction as 
far as play list, gesture, and feedback sounds are concerned, i.e., the input gesture 
is a sweep from right to left, the feedback sound descends in pitch and moves from 
right to left. The only difference in logic is that if the pointer is not at the beginning 
of a track, the first sweep backwards shifts the pointer to the beginning of that 
track. If the intention is to go to the previous track, another sweep is required. 

4. First track. The pointer is moved to the beginning of the first track in the play list. 
The corresponding gesture is the same sweep as in the ‘next track’ command, 
followed by a tap anywhere on the screen immediately after the sweep. The 
feedback sound is like that of the ‘next track’ command (ascending, panned from left 
to right), but instead of an electronic sound it is a piano roll. 

5. Last track. The function shifts the pointer to the beginning of the last track. 
Interaction properties are like in ‘first track’ but both input and output directions 
are opposite. 

6. (/7) Volume up/down. Volume control is performed by circular gestures, either 
clockwise (volume up) or counterclockwise (volume down). There is no explicit 
feedback sound if the track is in play mode, since the decrease or increase in 
volume was assumed to be clear enough audio feedback for the successful action. 
However, if the volume is adjusted in the pause mode, there is a separate feedback 
sound: a chord of organ sounds, with the number of notes in the chord indicating 
the resulting volume level. 
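
The pointer logic of the functions listed above can be summarised in a short Python sketch. This is a minimal illustration written from the description, not the GestureJukeBox implementation: the class `Playlist`, the gesture names in `GESTURES` and the helper `handle` are hypothetical, volume control is left out, and the wrap-around of ‘previous track’ in a looped play list is an assumption based on the statement that it mirrors ‘next track’.

```python
class Playlist:
    """Hypothetical model of the play list pointer described in the text."""

    def __init__(self, n_tracks, looped=False):
        self.n = n_tracks
        self.looped = looped
        self.track = 0        # index of the current track
        self.offset = 0.0     # seconds played within the current track
        self.playing = False

    def play_pause(self):
        self.playing = not self.playing

    def next_track(self):
        if self.track < self.n - 1:
            self.track += 1
        elif self.looped:
            self.track = 0    # track after the last one is the first
        # on the last track of a non-looped list: back to its beginning
        self.offset = 0.0

    def previous_track(self):
        if self.offset > 0:
            self.offset = 0.0             # first sweep: start of this track
        elif self.track > 0:
            self.track -= 1
        elif self.looped:
            self.track = self.n - 1       # assumed mirror of next_track

    def first_track(self):
        self.track, self.offset = 0, 0.0

    def last_track(self):
        self.track, self.offset = self.n - 1, 0.0


# Gesture names are illustrative labels for the gestures in the list above.
GESTURES = {
    "double_tap": "play_pause",
    "sweep_left_to_right": "next_track",
    "sweep_right_to_left": "previous_track",
    "sweep_left_to_right_then_tap": "first_track",
    "sweep_right_to_left_then_tap": "last_track",
}


def handle(playlist, gesture):
    """Dispatch a recognised gesture to the matching play list function."""
    getattr(playlist, GESTURES[gesture])()
```

For example, with the pointer in the middle of a track, two calls to `handle(p, "sweep_right_to_left")` are needed to reach the previous track, as item 3 describes.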

3.2 Sample Design from the Perspective of Metaphors 

We used several, qualitatively different kinds of metaphors in the design of 
GestureJukeBox. In order to make the metaphors work, the device has to be attached 
on the hip, on the right side of the user (for left-handed users, it can be on the left, with 
all the gesture directions reversed in the settings of the application). The device 
has to be in a relatively upright position. The audio output has to be listened to with 
headphones in order to stabilise the directions of panned sound. 

3.2.1 Gestures 

In this setting (Figure 1), the physical forward-backward directions were equated with 
directions in the play list. A sweep forward leads forward in the list, a sweep 
backwards causes the pointer to go backwards. In terms of the traditional usage of the 
concept of metaphor it can be argued that this parallel between the physical directions 
and abstract directions in the play list is in accordance with the definitions of 
metaphor described above. Even if this gestural metaphor is non-verbal in nature, it 
clearly works analogically to its verbal counterparts. 

The gestures related to the ‘first track’ and ‘last track’ commands are more 
complicated to understand as metaphors. The corresponding visual symbol in standard 
control panels is an arrow that ends in a bar, and that was the visual representation 
behind the gesture design (horizontal sweep + tap). If we interpret the gesture only as 
an imitation of its visual counterpart, we do not have a proper basis for calling the 
gesture a metaphor. However, it is as questionable to call it a simulation, since the 
gesture still has abstract, metaphorical components. The problem of labelling the 
nature of this gesture is discussed later. 
Fig. 1. Some gestures and directions 



Interpreting the gestures for increasing and decreasing the volume in a similar way 
is ambiguous. To be precise, the development that led to the circular gestures started 
from our previous design. In it, volume control gestures were vertical sweeps across 
the screen. However, in that design we faced a gear problem: If volume could be 
adjusted from zero to full with one single sweep, the sweep proved to be too short for 
adjusting to exactly the right level. We then used sweeps that only slightly affected 
the volume level, but then the number of required sweeps was often large. So we 
ended up (in GestureJukeBox) with a solution in which we extended the length of the 
possible sweep by rolling a single long sweep up into a circle. However, since the user is 
unaware of this evolution, he or she would probably associate the circular volume 
control with something else. The user could associate the circular adjustment with the 
very familiar potentiometer with its knob. Since that kind of technology is the most 
usual way of controlling volume in all kinds of players, radios etc., the circular 
volume control gesture can be interpreted as a knob simulation. This interpretation 
can be supported by the assumption that if it had been possible to construct a large 
physical knob, it would probably have been more usable for the needs of volume 
control. Just like in all simulations, this simulation thus was only a substitute for its 
real-world counterpart. 
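
The way a circular gesture resolves the gear problem can be illustrated with a small Python sketch. It is a sketch under stated assumptions, not the recogniser used in GestureJukeBox: the function names, the choice of a fixed centre point and the parameter `full_range_turns` (the number of full circles needed to go from zero to full volume) are all hypothetical, and screen coordinates are assumed to have the y axis pointing down.

```python
import math


def swept_angle(points, center):
    """Total signed angle in radians swept by touch points around center.

    With the y axis pointing down (screen coordinates), a positive result
    corresponds to a clockwise movement as seen by the user.
    """
    cx, cy = center
    total = 0.0
    prev = math.atan2(points[0][1] - cy, points[0][0] - cx)
    for x, y in points[1:]:
        cur = math.atan2(y - cy, x - cx)
        delta = cur - prev
        # unwrap so that each step is the shortest rotation between samples
        if delta > math.pi:
            delta -= 2 * math.pi
        elif delta <= -math.pi:
            delta += 2 * math.pi
        total += delta
        prev = cur
    return total


def adjust_volume(volume, points, center, full_range_turns=2.0):
    """Return the new volume (0.0 to 1.0) after a circular gesture."""
    turns = swept_angle(points, center) / (2 * math.pi)
    return min(1.0, max(0.0, volume + turns / full_range_turns))
```

Because the finger can keep circling without leaving the screen, the gesture length is no longer bounded by the screen size, so a small change per degree and a full volume range in one gesture are both possible.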

Understanding a tap, not to speak of the double tap that was used in our design, in 
terms of metaphor theories is more complicated. We should probably go back to the 
metaphors of physical push buttons, and a tap would then be seen as a metaphor of a 
metaphor or even further. In this paper, we concentrate on simpler cases. In order to 
contribute to the conceptual analysis of metaphors in UIs we focus on the comparison 
between the clearly metaphorical sweep directions and the circular volume control. 

3.2.2 Audio Feedback 

The feedback sounds of GestureJukeBox are designed to illustrate the directions in 
the play list. As mentioned above, the same information is illustrated in changes in 
pitch and panning. But why did we choose to illustrate reversing by decreasing pitch 
(and vice versa for going forwards)? Why did we choose to pan leftwards when 
referring to steps backwards (and vice versa for forwards)? Our aim was to imitate the
order of buttons in a standard music player control panel. So this design had clearly 
more properties of a simulation than a metaphor. 



3.3 Evaluation 

We evaluated our design from the point of view of learnability. Learnability would 
provide us with the basis for making assessments about the quality of metaphors and 
simulations we used. Since learnability is also a key factor in usability, we could 
receive important information about overall usability. 

In the experiment the participants were to perform a set of tasks. We had ten (four 
female and six male) participants. All were undergraduate computer science students. 
The form of data collection was digital video. In order to obtain very accurate and 
detailed information about the behaviour of the users, we needed to use two cameras. 
One of them was for close-up pictures and the other one for an overall view. A close- 
up picture of the device and the controlling finger was essential in order to track the 
user’s actions in detail. As this kind of setting is extremely complicated to organise in 
field conditions, we decided to use a laboratory and simulate a mobile setting. 

In our previous study [20, 21, 22] we used a simple mini-stepper exercise machine 
to simulate real-life conditions and make the participants perform movements that are 
typical in walking or climbing stairs. However, since the mini-stepper left hands free, 
the participants tended to hold the device in the controlling hand, waiting for the next 
task. In real life conditions, the device is supposed to be controlled in various 
conditions, even when the hands are involved in other tasks and the device is touched 
only when performing a control task. So we changed the setting of the experiment so 
that the hands are occupied elsewhere and used to control the device only when asked. 
We chose an exercise bike because, when pedalling it, the user usually leans on the 
handlebars (Figure 2). 

The participants were supposed to pedal continuously, listening to the music from 
the GestureJukeBox, and to simultaneously carry out simple tasks with the player. 
The instructions for the tasks were presented on a screen in front of the participants. 
There were ten different kinds of tasks: ‘Play’, ‘Stop’, ‘Next track’, ‘Previous track’, 
‘First track’, ‘Last track’, ‘Increase volume’, ‘Decrease volume’, ‘Count the number 
of tracks’, ‘Find your favourite track’. These tasks followed each other in varying 
order in a 20-25-minute session, resulting in dozens of attempts of each kind, except 
for ‘Count the number of tracks’, which was performed only once, and ‘Find
your favourite track’, which was performed twice. During the test session the total 
number of registered user actions ranged from 137 to 195. The variation in the 
number of actions was partly due to tasks that required a varying number of actions, 
such as ‘Find your favourite track’, partly because each retrial after an unsuccessful 
action was registered separately. 

Each user action was recorded along with information about its timing and its 
success. From the video data, it was possible to decide whether a certain action was 
successful or not. Video analysis provided a clear advantage over automated data 
collection methods using the input device (e.g. creating log files). The main advantage 
was that the video analysis allowed detection of actions which the touch screen itself
did not recognise. 

To illustrate the learning process, each user action was extracted from the data and 
the success percentage of the five most recent actions related to the same function was 
counted. For example, if 3 of the 5 most recent attempts to perform the function 
‘play/stop’ were successful, the success percentage at that point was 60. This 
percentage was supposed to predict the success of the next action. Because of the 
method, the counting started from the fifth attempt at each kind of function. 
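
The calculation described above can be sketched as follows. This is only an illustration of the five-attempt moving success percentage, not the analysis tool used in the study; the function name `learning_curves` and the log format are assumptions introduced here.

```python
from collections import defaultdict, deque


def learning_curves(actions, window=5):
    """Return {function: [success % over the `window` most recent attempts]}.

    `actions` is an iterable of (function_name, succeeded) pairs in
    chronological order (hypothetical log format). A point is produced only
    once `window` attempts of that function exist, so each curve starts at
    the fifth attempt when window=5.
    """
    recent = defaultdict(lambda: deque(maxlen=window))
    curves = defaultdict(list)
    for function, succeeded in actions:
        history = recent[function]
        history.append(bool(succeeded))  # the oldest attempt drops out automatically
        if len(history) == window:
            curves[function].append(100.0 * sum(history) / window)
    return dict(curves)


# Made-up example: six attempts at 'play/stop'.
log = [("play/stop", s) for s in (True, False, True, True, False, True)]
curve = learning_curves(log)["play/stop"]  # two points: attempts 1-5 and 2-6
```

With 3 successes among the first five attempts, the first point of the curve is 60, which matches the example given in the text.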

We call the resulting graphs “learning curves” since the form of the curve 
illustrates the learning process. Figure 3 shows the learning curves of one participant. 
The first graph tells us that each attempt to perform the function ‘Play / Stop’ was 
successful. The success percentage was at the 100% level throughout the session. The 
‘Next track’ function at first succeeded for a while, but then the success percentage 
dropped drastically, though recovering rapidly. The tendency was similar in the 
‘Previous track’ function, though at a lower level. Volume adjustment functions were 
consistently at a relatively high level. The curves of the functions shifting the pointer 
to either end of the playlist (‘First track’ / ‘Last track’) showed improving success at 
the beginning of the session and reduced success later. 

The learning curves of the different subjects differed markedly from each other. We 
combined the data concerning each function and found that the variations among 
learning curves smoothed out all interesting data from combined learning curves. 




Fig. 2. Collection of video data 

Fig. 3. Sample set of learning curves (recovered panel titles: Volume up, Volume down, First track, Last track)

In the interpretation of learning curves it is essential to understand the nature of the 
learning process. Before the first trial, the participant was given very short 
instructions about the functions. At the beginning of the session, we could observe 
whether the participant then tried to act in an appropriate way. For example, if the 
participant was given a task ‘Next track’ and we could observe from the close-up 
picture that he or she performed a sweep forward, we could conclude that the 
metaphor worked and its use resulted in the desired behaviour. In addition to the 
logic, the participants had to learn the physical properties of the touch screen. Most 
failures were ultimately caused by the insensitivity of the device. The touch screen is
optimised for use with a stylus. When we used it with a finger, it was often difficult to 
estimate the required amount of pressure. Very subtle details like the shape of a nail 
could have a dramatic effect. We even used a finger plectrum to overcome this 
problem, but in the end it took a while to get used to the sensitivity of the input 
device. 

In this paper, our focus is not on the analysis of the learning process. Here, we
concentrate on the discussion of the qualitative differences between the metaphors 
and simulations in the UI of our sample design. In that sense, there were two key 
findings. 

First, the shapes of the learning curves could be grouped in most cases just as they 
are grouped in Figure 3. The curves concerning functions that were based on a similar 
metaphor or simulation resembled each other. This supports our hypothesis about the 
qualitative differences between metaphors and simulations. 

The second key finding is the variation among participants. It supports the idea that 
the same UI element may be processed quite differently by different users. As
discussed earlier, the same element might work as a metaphor for one and as a 
simulation for another. In addition, the gradual conversion from metaphor (or 
simulation) to an idiom might be at a different stage. Figure 4 illustrates this 
complexity. First there is a horizontal dimension, referring to the continuum between 
metaphor and simulation. A virtual entity may be located anywhere between these 
extremes, and the location is highly dependent on the interpreter. Wherever the
interpretation may be on that axis at the beginning, it is bound to turn into an idiom
in the course of time, or more precisely, in the course of use. The triangular shape
illustrates the fact that the more the entity has the character of an idiom, the less
its origin (whether a metaphor or simulation) matters.



4 Conclusions and Discussion 

Our study started from theoretical reasoning about the nature of metaphor, simulation 
and idiom. With the help of a sample design, we concluded by describing the 
relationships among those concepts. The resulting model has two kinds of 
implications. First, the model can be used to understand the user’s way of adopting 
new concepts. For example, this gives us a theoretical framework when trying to 
understand one of the participants of our experiment, who consistently tried to go 
forward in the playlist using a backward sweep (and vice versa). Did this virtual 
control element have some other metaphorical meaning for her than for the designer 
and all the other participants? Or was it rather a simulation of some kind for her? 
Second, the model can help the designer to express himself / herself. When 
constructing a virtual entity, the designer is bound to have a strong mental 
representation of the nature and properties of that entity. If the nature of the entity is 
to substitute for something that is familiar to a user, the designer might consider 
simulating the familiar thing. However, the more new properties there are compared 
to familiar entities, the more appropriate it is to use metaphors. Finally, whether the 
endeavour is a quest for metaphor or simulation, the designer should keep in mind the 
durability of the properties. Sooner or later the new concept will be an established part 
of the user’s conceptual system (idiom) and it should then still work.

The way we think about ubiquitous computing in the future is firmly connected to 
the concepts that are created today. By understanding human conceptualisation 
processes - like metaphor creation and metaphor’s lifecycle - we are able to develop 
something genuinely novel. 




Fig. 4. The relations among the central concepts (recovered figure label: Idiom)

Abstract. This paper defines the problem space of distributed, migratable and
plastic user interfaces, and presents CAMELEON-RT¹, a technical answer to
the problem. CAMELEON-RT is an architecture reference model that can be
used for comparing and reasoning about existing tools as well as for developing
future run time infrastructures for distributed, migratable, and plastic user
interfaces. We have developed an early implementation of a run time
infrastructure based on the precepts of CAMELEON-RT.



1 Introduction 

Technological advances in computing, sensing, and networking are rapidly leading to
the capacity for individuals to create and mould their own interactive spaces.
Interactive spaces will be assembled opportunistically from public hot spots and private
devices to provide access to services within the global computing fabric. Interactive
spaces will also take the form of autonomous computing islands whose horizon will
evolve, split and merge under the control of users. With this move to ubiquitous
computing, user interfaces (UI) are no longer confined to a unique desktop. Instead, UI’s
may be distributed and migrate across a dynamic set of interaction resources that are
opportunistically composed, borrowed and lent. As a consequence of distribution and
migration, user interfaces must be plastic in order to adapt gracefully to changes of
the interactive space.

To address the problem of developing distributed, migratable, plastic UI’s, we
propose CAMELEON-RT, a conceptual architecture reference model. This model is a
canonical functional decomposition that can be used for comparing and reasoning
about existing tools as well as for developing future run time infrastructures for
distributed, migratable, and plastic user interfaces (DMP-UI). The article is structured as
follows. In the next section, we introduce the terminology to establish a common
understanding of the problem space for DMP-UI. We then analyze the state of the art in
the light of the problem space to motivate our own proposition with CAMELEON-RT
and two case studies. We close the presentation with a discussion of future work.

¹ RT stands for Run-Time.



2 Terminology and Problem Space 

Our terminology covers two aspects of the problem: the capacity for people to mould
the digital and the physical worlds into cohesive interactive spaces, and the
consequence on user interfaces which, from centralized and immutable, become
distributed, migratable, and plastic.



2.1 Interactive Space 

An interactive space is a combination of three complementary things. It includes a) 
the physical place where the interaction takes place, b) the computing, networking 
and interaction resources available at this place, and c) the digital world (or set of 
services) that supports human activities in this space. The physical place is modeled 
with attributes and functions such as location, social use, and light conditions. The 
computing, networking and interaction resources bind together the physical space 
with the digital world. In particular, an interaction resource is a physical entity that 
allows users to modify and/or observe the state of the digital world. Typically, mice
and keyboards are used as input interaction resources (we call them instruments), but
phicons are instruments as well. Display screens are used as output interaction
resources (we call them surfaces). Augmented tables and rain curtains are surfaces as
well. The interaction resources of an interactive space are managed by a platform.

The platform is elementary when it is composed of one computer. It is a cluster
when it is assembled from a set of computers. The assembly may be static (the
configuration cannot be modified on the fly) or dynamic. When dynamic, an elementary
platform or an interaction resource may arrive or disappear. Alternatively, the set of
interaction resources may stay the same but the relationships between them, such as
the orientation of the surfaces, may change. The cluster is homogeneous (it is
composed of identical elementary platforms) or heterogeneous (the resources and/or the
operating system of the constituents differ).
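
The platform taxonomy above can be made concrete with a small data model. The sketch below is illustrative only: the class names, fields, and the operating-system labels "os-a" and "os-b" are assumptions introduced here, and the homogeneity test simply compares operating system and resource sets.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ElementaryPlatform:
    name: str
    operating_system: str
    resources: Tuple[str, ...]  # e.g. ("screen", "mouse", "keyboard")


@dataclass(frozen=True)
class Platform:
    members: Tuple[ElementaryPlatform, ...]
    dynamic: bool = False  # True if the configuration may change on the fly

    @property
    def is_cluster(self) -> bool:
        return len(self.members) > 1

    @property
    def is_homogeneous(self) -> bool:
        # Identical constituents: same operating system and same resources.
        signatures = {
            (m.operating_system, tuple(sorted(m.resources))) for m in self.members
        }
        return len(signatures) <= 1

    def describe(self) -> str:
        if not self.is_cluster:
            return "elementary platform"
        assembly = "dynamic" if self.dynamic else "static"
        kind = "homogeneous" if self.is_homogeneous else "heterogeneous"
        return f"{assembly} {kind} cluster"


# Placeholder configurations (operating-system labels are invented).
boards = tuple(
    ElementaryPlatform(f"board-{i}", "os-a", ("touch-surface",)) for i in range(3)
)
pc = ElementaryPlatform("pc", "os-a", ("screen", "mouse", "keyboard"))
pda = ElementaryPlatform("pda", "os-b", ("touch-surface",))
```

Under these assumptions, `Platform(boards)` is classified as a static homogeneous cluster and `Platform((pc, pda), dynamic=True)` as a dynamic heterogeneous cluster.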

For example, the I-land’s DynaWall [19] is a static homogeneous cluster: it is
composed of three interconnected electronic whiteboards controlled with the same
underlying infrastructure, Beach [20]. On the other hand, the ConnectTables of I-land,
where two identical tablets running the same system can be plugged together,
constitute a dynamic homogeneous cluster [21]. Pebbles [12] supports the construction of
dynamic heterogeneous clusters where multiple PDA’s can be connected on the fly to
control the display surface of a workstation. In iRoom [9], the cluster is a dynamic
assembly of workstations that run a mix of Windows and Unix. Within an interactive
space, we need to consider how UI’s are distributed, how they can migrate and
support plasticity.



2.2 UI Distribution

A UI is distributed when it uses interaction resources that are distributed across a 
cluster. For example, in graphical UI’s (GUI), the rendering is distributed if it uses 
surfaces that are managed by different elementary platforms. The granularity of the 
distribution may vary from application level to pixel level. 

At the application level, the GUI is fully replicated on the surfaces managed by
each elementary platform. The x2vnc implementation of the VNC protocol offers an
application level distribution. At the workspace level, the unit for distribution is the
workspace. A workspace is a compound interactor that supports the execution of a set
of logically connected tasks. PebblesDraw [12] and Rekimoto’s Pick and Drop [17]
are examples of UI distribution at the workspace level. The interactor level
distribution is a special case of the workspace level where the unit for distribution is an
elementary interactor. At the pixel level, any user interface component can be partitioned
across multiple surfaces. For example, in the DynaWall, a window may simultaneously
lie over two contiguous whiteboards as if these were managed by a single
computer.



2.3 UI Migration 

UI migration corresponds to the transfer of all or part of the UI to different interaction 
resources whether these resources belong to the current platform or to another one. 

Migration is static when it is performed off-line between sessions. It is dynamic
when it occurs on the fly. In this case, a state recovery mechanism is needed so that
users can pursue their activities in a seamless manner. In addition, the migration of a
user interface is total if the user interface moves entirely to a different platform. It is
partial when only a subset of the user interface moves to different interaction
resources. For example, on the arrival of a PDA, the control panels currently rendered
on a whiteboard migrate to the PDA.

Migration and distribution are two independent notions: a UI may be distributed
but not migratable, and a centralized UI may migrate to a cluster and distribute itself
across the interaction resources of the new platform. A priori, the most powerful UI’s
are those that are both dynamically distributable and migratable. However, migration
and distribution may in turn require the UI to be plastic.



2.4 UI Plasticity 

The term plasticity is inspired by the capacity of solids and biological entities, such
as plants and the brain, to adapt to external constraints to preserve continuous usage.
Applied to HCI, plasticity is the capacity of an interactive system to adapt to changes
of the interactive space while preserving usability [2]. Usability is defined as a set of
properties $[p_1, \ldots, p_i, \ldots, p_n]$ (e.g., observability, predictability [7]) such that, for each
$p_i$, a metric is defined with a domain of values $d_i$. The job of a plastic UI is to
maintain the set of properties within their domain of values. Given a user interface UI
and its usability $U = \{(p_1, d_1), \ldots, (p_i, d_i), \ldots, (p_n, d_n)\}$, the domain of plasticity of UI is
the set of situations $S$ for which UI is able to preserve $U$.
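
The definition can be illustrated with a small sketch in which each property carries a metric and an interval of acceptable values, and the domain of plasticity is computed over a finite list of candidate situations. The interval representation, the function names, and the example "observability" metric (with its 800-pixel reference width) are assumptions made for illustration, not part of the definition in [2].

```python
from typing import Callable, Dict, List, Tuple

Situation = Dict[str, float]
Metric = Callable[[Situation], float]
# Usability U: property name -> (metric, (lowest, highest) acceptable value)
Usability = Dict[str, Tuple[Metric, Tuple[float, float]]]


def preserves_usability(usability: Usability, situation: Situation) -> bool:
    """True if every property's metric stays within its domain of values."""
    return all(
        low <= metric(situation) <= high
        for metric, (low, high) in usability.values()
    )


def domain_of_plasticity(
    usability: Usability, situations: List[Situation]
) -> List[Situation]:
    """Subset of the candidate situations for which U is preserved."""
    return [s for s in situations if preserves_usability(usability, s)]


def observability(situation: Situation) -> float:
    # Invented metric: fraction of the UI state that fits on the screen.
    return min(1.0, situation["screen_width"] / 800.0)


usability = {"observability": (observability, (0.5, 1.0))}
candidates = [{"screen_width": 1024.0}, {"screen_width": 480.0}, {"screen_width": 240.0}]
domain = domain_of_plasticity(usability, candidates)
```

In this made-up example the 240-pixel situation falls outside the domain of plasticity, so another component would have to take over the adaptation for it.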

Early work on UI plasticity demonstrates that UI migration and distribution
between very different platforms go far beyond software portability. For example, the
static migration to a mobile phone of an application designed for a workstation may
result in multiple forms of adaptation. This may range from replacing graphics
interactors with vocal interactors, to restructuring the dialogue or even suppressing
services [22]. Retargeting a UI may be static and/or performed at run time. When
retargeting is static, a set of pre-computed concrete user interfaces (CUI) is produced in
advance for a predefined set of situations.

The production of CUI’s by reification is one approach to the problem: typically,
the process starts from high-level descriptions such as task and domain models to
produce an abstract UI (AUI), then from an AUI, produces one or multiple CUI’s.
Alternatively, an existing CUI may be reverse-engineered by means of abstraction to
obtain an AUI and/or a task model. These abstract representations are then translated
to fit the new target, then reified into new CUI’s.
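
As a rough illustration of static retargeting by reification, the sketch below derives an AUI from a toy task model and then pre-computes one CUI per target. The task format, the two abstract interactor kinds, the target names, and the reification rules are all hypothetical; real tools rely on much richer models.

```python
from typing import Dict, List

# Hypothetical reification rules: abstract interactor kind -> concrete interactor.
REIFICATION_RULES: Dict[str, Dict[str, str]] = {
    "workstation": {"choice": "drop-down list", "trigger": "push button"},
    "phone-voice": {"choice": "spoken menu", "trigger": "voice command"},
}


def task_model_to_aui(tasks: List[dict]) -> List[Dict[str, str]]:
    """Task model -> abstract UI: tasks with options become 'choice' interactors."""
    return [
        {"task": t["name"], "kind": "choice" if t["options"] else "trigger"}
        for t in tasks
    ]


def reify(aui: List[Dict[str, str]], target: str) -> List[Dict[str, str]]:
    """Abstract UI -> concrete UI for one target."""
    rules = REIFICATION_RULES[target]
    return [{"task": a["task"], "interactor": rules[a["kind"]]} for a in aui]


def precompute_cuis(tasks: List[dict], targets: List[str]) -> Dict[str, list]:
    """Static retargeting: one CUI per predefined target, produced in advance."""
    aui = task_model_to_aui(tasks)
    return {target: reify(aui, target) for target in targets}


tasks = [
    {"name": "select slide", "options": ["1", "2", "3"]},
    {"name": "next slide", "options": []},
]
cuis = precompute_cuis(tasks, ["workstation", "phone-voice"])
```

The same AUI yields graphical interactors for the workstation and vocal interactors for the phone, which mirrors the kind of adaptation mentioned above.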

Having defined the dimensions of the problem space for DMP-UI’s, we need to 
identify how the software tools of the state of the art address the problem. 



3 Analysis of the State of the Art 

UI plasticity is supported for centralized, statically migratable UI’s only. No tool
addresses UI distribution and no tool supports on-the-fly migration of the user
interface between platforms. Migration can only occur between sessions. Development
tools like Teresa [15] and ArtStudio [2] pre-compute CUI’s from high level
specifications. They are complemented by reverse-engineering tools like Webrevenge [14] and
Vaquita [23] that proceed with a combination of abstraction, translation, and
reification. Digymes [4] and Icrafter [16], on the other hand, generate CUI’s at run time
where a renderer dynamically computes a CUI from an AUI.

Websplitter [8] supports the distribution of web page content at the interactor level
across the interaction resources of heterogeneous clusters, but distribution is statically
specified in an XML policy file. In Pebbles, the types of interaction resources are
known and the distribution of the UI is statically assigned based on this knowledge:
the public screen of a workstation contains a shared workspace whereas PDA’s
contain a panel to control the shared display. In iRoom, mouse pointers can migrate
between the screens of the cluster. However, windows are confined to the screen where
they have been created. Beach, on the other hand, supports the dynamic distribution
and migration of UI’s at the pixel level, but the cluster is static and homogeneous.

This brief overview of the state of the art reveals that no software tool currently 
supports all aspects of distributed, migratable and plastic user interfaces. With 
CAMELEON-RT, we propose an architecture reference model that integrates all of 
the functional components necessary to support the dynamic distribution, migration 
and plasticity of UI’s across dynamic heterogeneous clusters. This canonical
conceptual architecture can then be instantiated in different ways to implement run-time
infrastructures adequate for a subset of the problem space of DMP-UI’s. This is what
we have done with the development of two very different case studies.



4 Case Studies: CamNote and I-AM 

CamNote (for CAMELEON Note) is a slides viewer that runs on a dynamic
heterogeneous platform. This platform may range from a single PC to a cluster composed of
a PC and a PDA. I-AM is a platform manager similar in spirit to X-Window, but that
supports dynamic heterogeneous clusters of workstations.



4.1 CamNote 



The UI of CamNote includes four workspaces: a slides viewer, a note editor for
associating comments with slides, a video viewer also known as pixels mirror that shows a
live video of the speaker [24], and a control panel to browse the slides and to set up
the level of transparency of the pixels mirror. As shown in Figure 1, the pixels mirror
is combined with the slides viewer using alpha-blending. Speakers can point at items
on the slide using their finger. This means of pointing is far more compelling and
engaging than the conventional mouse pointer that no one can see [11].




Fig. 1. (a) The user interface of CamNote when distributed across a PC screen and a PocketPC screen;
(b) the control panel when displayed on the PC screen

Figure 1a shows a Pebbles-like configuration where the graphical UI is distributed
across the surfaces of a PC and of a PDA. The slides viewer is displayed in a rotative
canvas so that it can be oriented appropriately when projected on a horizontal
surface (e.g., a table). If the PDA disappears from the cluster, the control panel
automatically migrates to the PC screen. Because different resources are now available,
the panel is plastified. As shown in Figure 1b, the retargeted control panel includes
different interactors, and a miniature representation of the speaker’s video is now
available. During the migration-retargeting process, users can see the control panel
emerging progressively from the slides viewer while rotating so that they can evaluate
the state of the transition. The UI, which was distributed on a heterogeneous cluster,
is now centralized on an elementary platform. The new UI results from a dynamic
partial migration and retargeting at the workspace level. Conversely, if the PDA
re-enters the platform, the UI automatically switches to the configuration of Figure 1a
and the control panel disappears from the PC screen by weaving itself into the slides
viewer before reappearing on the PDA.



4.2 The Interaction Abstract Machine (I-AM)



I-AM (Interaction Abstract Machine) supports the dynamic configuration of
interaction resources to form a single logical interactive space [1]. These resources are
managed by different elementary workstations running distinct operating systems (i.e.,
MacOS X, Windows NT and XP). Users can distribute and migrate user interfaces at
the pixel level as if these UI’s were handled by a single computer. This illusion of a
unified space is provided at no extra cost for the developer who can re-use the
conventional GUI programming paradigm.

Figure 2 shows early examples of interaction techniques that allow users to control 
the platform of their interactive space. Figure 2a corresponds to the situation where 
two applications are running on two independent workstations. A closed blue border 
outlines the screens to denote the absence of coupling. In Figure 2b, the screens are 
now coupled to provide the “single display area” function. A blue border outlines the 
display area and a gateway shows where interactors can transit between the screens. 





Fig. 2. (a) The PC and the Macintosh are decoupled and run two applications. (b) The two
screens are coupled by bringing them in close contact to form a single information space.
(Outlines have been artificially enhanced on the pictures to increase readability.)

Within an interactive space, any instrument can be used to modify any interactor. For
example, in the configuration of Figure 2b, a PC mouse can be used to move a
window created on the Macintosh and migrate it to the PC. Or the two mice can be used
simultaneously. The user can select a text field interactor displayed on the Macintosh
screen with the PC mouse. Text can now be entered with the PC keyboard. If the text
field is selected with the Macintosh mouse, text can be entered with the Macintosh
keyboard as well. I-AM supports the dynamic configuration of clusters of workstations
running different operating systems, and it supports the dynamic migration and
distribution of UI at the pixel level at no extra cost for the programmer, but it does
not support plasticity. I-AM and CamNote, although very different in terms of their
functional coverage, comply with the principles of CAMELEON-RT.



5 The CAMELEON-RT Architecture Reference Model 



As shown in Figure 3, CAMELEON-RT is structured into three levels of abstraction:
at the two extremes, the interactive systems layer and the platform layer; at the core
of the architecture, the Distribution-Migration-Plasticity middleware
(DMP-middleware) that provides mechanisms and services for DMP UI’s.




Fig. 3. The CAMELEON-RT architecture reference model. A flower-like shape denotes
open-adaptive components. The miniature adaptation-manager shape denotes
close-adaptive components. Arrows denote information flow, and lines bi-directional links.



5.1 The Platform Layer 

The platform layer corresponds to the notion of platform as defined in Section 2. It 
includes the hardware and the legacy operating system(s), which, together, form the 
ground-basis of an interactive space. The hardware denotes a wide variety of physical 
entities: surfaces and instruments, computing and communication facilities, as well as 
sensors and actuators. 



5.2 The Interactive Systems Layer 

This layer includes the interactive systems (e.g., CamNote) that users are currently 
running in the interactive space. The Meta-User Interface (meta-UI) is one of them. 
The meta-user interface is to interactive spaces what the desktop is to conventional
workstations: it binds together the activities that can be performed within the
interactive space and provides users with the means to configure, control and evaluate the
state of the space. In practice, the meta-UI brings together the user interfaces of all of
the DMP-middleware components in a unified manner. For example, the interaction
technique used to couple two surfaces is part of the meta-UI. In CamNote, the
animation that allows users to evaluate the migration and adaptation process of the
control panel is also part of the meta-UI. Like the UI of any application running in the
interactive space, a meta-UI is DMP.

As discussed in 2.4, a DMP UI is characterized by a domain of plasticity. This
means that the UI embeds mechanisms for self-adaptation as long as the requirements
for adaptation lie within its domain of plasticity. The UI is said to be close-adaptive
for these situations. For situations that cannot be handled by the UI alone, the UI must
be open-adaptive so that the DMP-middleware layer can take over the process. The
UI is open-adaptive if it provides the world with management mechanisms.
Management mechanisms include self-descriptive meta-data (such as the current state and
the services it supports and requires), and the methods to control its behavior such as
start/stop and get/set-state. Software reflexivity coupled with a component model is a
good approach to achieve open-adaptiveness [13]. Close-adaptiveness and
open-adaptiveness both comply with the four-step process presented in 5.3.4: observe the
world, detect situations that require adaptation, compute a reaction that satisfies the
situation, and generate a new UI.
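
One possible reading of these management mechanisms is sketched below: a component exposes meta-data, start/stop and get/set-state, adapts by itself inside its domain of plasticity (close adaptation), and is otherwise replaced from outside with its state carried over (open adaptation). All names (`OpenAdaptiveComponent`, `ControlPanel`, `handle_situation`) and the surface labels are hypothetical; the paper does not prescribe this API.

```python
from abc import ABC, abstractmethod
from typing import Any, Callable, Dict, Iterable


class OpenAdaptiveComponent(ABC):
    """Management mechanisms offered to the middleware (hypothetical API)."""

    @abstractmethod
    def describe(self) -> Dict[str, Any]: ...

    @abstractmethod
    def start(self) -> None: ...

    @abstractmethod
    def stop(self) -> None: ...

    @abstractmethod
    def get_state(self) -> Dict[str, Any]: ...

    @abstractmethod
    def set_state(self, state: Dict[str, Any]) -> None: ...


class ControlPanel(OpenAdaptiveComponent):
    def __init__(self, supported_surfaces: Iterable[str]):
        self.supported_surfaces = set(supported_surfaces)  # assumed domain of plasticity
        self.running = False
        self.slide = 1
        self.layout = None

    def describe(self):
        return {"provides": ["slide-control"], "requires": ["surface"]}

    def start(self):
        self.running = True

    def stop(self):
        self.running = False

    def get_state(self):
        return {"slide": self.slide}

    def set_state(self, state):
        self.slide = state["slide"]

    def adapt(self, surface: str) -> bool:
        """Close adaptation: succeed only inside the domain of plasticity."""
        if surface in self.supported_surfaces:
            self.layout = surface
            return True
        return False


def handle_situation(component, surface: str,
                     find_replacement: Callable[[str], ControlPanel]):
    """Let the UI adapt itself if it can; otherwise swap it and transfer state."""
    if component.adapt(surface):
        return component
    component.stop()
    replacement = find_replacement(surface)  # stands in for the middleware
    replacement.set_state(component.get_state())
    replacement.adapt(surface)
    replacement.start()
    return replacement
```

The state transfer through `get_state`/`set_state` is what lets the user continue on the same slide after the component has been replaced.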



5.3 The DMP Middleware Layer 

The DMP-middleware layer aims at satisfying three classes of requirements of our
problem space: modeling the physical space, supporting dynamic heterogeneous
clusters, and UI adaptation when distribution and migration occur. To each of these
requirements corresponds a service of the DMP-layer: a context infrastructure, a
platform manager along with its interaction toolkit, and an open-adaptation manager.

5.3.1 The Context Infrastructure 

The context infrastructure allows the interactive space to build and maintain a model 
of the physical place. From sensor data as well as from low-level operating system 
events, the context infrastructure generates contextual information at the appropriate 
level of abstraction. The Context Toolkit [6] and the Contextors [3] are examples of
tools that can be used to implement a context infrastructure. 

5.3.2 The Platform Manager and Interaction Toolkit 

The platform manager and the interaction toolkit play the same functional role as the
X-Window environment, but extended to dynamic heterogeneous clusters. This
includes: a) supporting resource discovery, b) hiding the heterogeneity of operating
systems and hardware, and possibly, c) supporting the distribution and migration of
UI’s. The interaction toolkit may be a conventional toolkit such as Motif or Swing,
and/or post-WIMP toolkits such as DiamondSpin [18] and our own toolkit used in
CamNote to develop rotative and zoomable UI’s.

Pebbles and iRoom support requirements a) and b), whereas c) is not addressed or
is limited to pointing instruments. In Aura, a context infrastructure is used for
resource discovery (as well as for modeling the physical place). On the other hand,
Aura, which addresses elementary platforms only, re-uses the native windowing
systems and toolkits as platform managers and toolkits. Beach supports b) and c) for
homogeneous clusters only. However, it is unclear whether Beach is able to detect the
arrival/departure of elementary platforms.

I-AM covers the a), b) and c) requirements. It uses the Contextors infrastructure to 
discover changes of the platform. To support the migration of UI’s at the pixel level, 
I-AM maintains one logical space per interactive system. A logical space is an ab- 
stract drawable populated with logical interactors (i.e., those that the programmer has 
created with the interaction toolkit). Logical interactors are projected on the surfaces 
of the cluster into physical interactors. The projection is an affine transformation that
takes into account the geometrical relationships between the surfaces as well as their
resolution. For example, a logical interactor that lies over two surfaces is projected
into two physical interactors, one per surface. For input, I-AM redirects input events 
performed on physical widgets, to the logical interactors that own them. 
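
The projection of a logical interactor onto several surfaces can be illustrated with a minimal Python sketch. It is not I-AM's implementation: the `Rect` and `Surface` types, the `project` function and the `scale` field (pixels per logical unit) are hypothetical names introduced here, and the affine transformation is restricted to a translation plus a uniform scale (no rotation).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    x: float
    y: float
    w: float
    h: float


@dataclass(frozen=True)
class Surface:
    name: str
    area: Rect    # region of the logical space displayed by this surface (assumption)
    scale: float  # pixels per logical unit, standing in for the surface resolution


def project(interactor, surfaces):
    """Return one physical rectangle per surface that the logical interactor overlaps."""
    physical = {}
    for s in surfaces:
        # Clip the logical interactor to the part of the logical space shown by s.
        left = max(interactor.x, s.area.x)
        top = max(interactor.y, s.area.y)
        right = min(interactor.x + interactor.w, s.area.x + s.area.w)
        bottom = min(interactor.y + interactor.h, s.area.y + s.area.h)
        if right > left and bottom > top:
            # Translate into the surface's own coordinates, then scale to pixels.
            physical[s.name] = Rect(
                (left - s.area.x) * s.scale,
                (top - s.area.y) * s.scale,
                (right - left) * s.scale,
                (bottom - top) * s.scale,
            )
    return physical
```

With two side-by-side surfaces, an interactor that straddles their common border yields two physical interactors, one per surface, as in the example given in the text.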

5.3.3 The Open-Adaptation Manager 

The Open-Adaptation Manager is a key component of CAMELEON-RT. It includes 
observers that feed into a situation synthesizer with appropriate contextual informa- 
tion. The situation synthesizer informs the evolution engine of the occurrence of a 
new situation that may require an open adaptation. If so, the evolution engine uses the 
components retriever and a configurator to produce a new UI. 

Observers serve as gateways between the “world” and the situation synthesizer. 
The platform observer gathers information about the platform (e.g., a new PDA has
arrived/has left, two surfaces have been coupled) by subscribing to the components of
the context infrastructure that probe the evolution of the platform. The physical
place observer maintains a model of the physical place (e.g., we are in room R, or in
street S), and the users observer probes users (for instance their profile, or their
position relative to a wall surface, so that information is not projected behind their
back). The interactive systems observer subscribes to information relevant to
interactive systems plastification. For instance, an interactive system may produce the
event “current situation S is out of my domain of plasticity” so that open-adaptation
can take over the retargeting process.

The situation synthesizer computes the current situation from information (i.e., the
observables) provided by the observers. A situation is defined by a set of observables
that satisfies a set of predicates [5]. When the cardinality of the set of observables
changes and/or when the predicates do not hold anymore, we enter a new situation. 
For example, in CamNote, the arrival or departure of a PDA results in a new situa- 
tion. Situations form a graph. Ideally, the graph of situations results from a mix of 
specifications provided by developers, by users (using the meta-UI), or learnt auto- 
matically by the situation synthesizer. In CamNote, the graph of situations has been 
provided by the programmer. The evolution engine is notified whenever a new
situation is entered.

The evolution engine elaborates a reaction in response to the new situation. As for
the graph of situations, the reaction may be a mix of specifications provided by
developers and/or users (using the meta-UI), or learnt by the evolution engine. In
CamNote, reactions are expressed by developers in terms of rules, for example, “if a
new PDA arrives, move the control panel to the PDA”. The reaction may result in
retargeting all or part of the UI. If so, the evolution engine identifies the components
of the UI that must be replaced and/or suppressed and provides the configurator with a
plan of actions. In the case of CamNote, the plan is to “replace the PC control panel
with a PDA control panel without losing any previous work”.
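
The interplay between the situation synthesizer and a rule-based evolution engine can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names, the dictionary of observables, the two situations (`pc_only`, `pc_and_pda`) and the `RULES` table are hypothetical, and the sketch notifies the engine only when the matching situation actually differs from the current one.

```python
class Situation:
    """A situation: a name plus predicates that the observables must satisfy."""

    def __init__(self, name, predicates):
        self.name = name
        self.predicates = predicates

    def holds(self, observables):
        return all(p(observables) for p in self.predicates)


class SituationSynthesizer:
    def __init__(self, situations, notify):
        self.situations = situations  # programmer-provided situations (assumption)
        self.notify = notify          # callback into the evolution engine
        self.current = None
        self.cardinality = None

    def update(self, observables):
        # Re-evaluate when the number of observables changes or when the
        # predicates of the current situation no longer hold.
        changed = (
            self.current is None
            or len(observables) != self.cardinality
            or not self.current.holds(observables)
        )
        self.cardinality = len(observables)
        if not changed:
            return
        for s in self.situations:
            if s.holds(observables):
                if s is not self.current:
                    self.current = s
                    self.notify(s)
                return


# Hypothetical rule table in the spirit of the CamNote example.
RULES = {"pc_and_pda": "move the control panel to the PDA"}
plans = []


def evolution_engine(situation):
    plan = RULES.get(situation.name)
    if plan is not None:
        plans.append(plan)


synthesizer = SituationSynthesizer(
    [
        Situation("pc_only", [lambda o: "pc" in o, lambda o: "pda" not in o]),
        Situation("pc_and_pda", [lambda o: "pc" in o, lambda o: "pda" in o]),
    ],
    evolution_engine,
)
```

Feeding the synthesizer first `{"pc": "desk"}` and then `{"pc": "desk", "pda": "pda-1"}` models the arrival of a PDA: the cardinality of the observables changes, a new situation is entered, and the engine selects the corresponding plan.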

The Configurator executes the plan. If new components are needed, these are re- 
trieved from the components storage by the components Manager. In CamNote, we 
reuse a technique developed in Information Retrieval: components of the components 
storage are described with conceptual graphs and retrieved with requests expressed 
with conceptual graphs. By exploiting component reflexivity, the configurator stops 
the execution of the “defectuous” components specified in the plan, gets their state, 
then suppresses or replaces them with the retrieved components and launches these 
components based on the saved state of the previous components. In CamNote, the
PC control panel is replaced with a PDA control panel and its state is restored
properly so that users can continue the slide show at the exact slide number reached
before migration occurred.
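
The stop, save, replace and restore sequence performed by the configurator can be sketched as follows. The `ControlPanel` class and its `start`, `stop` and `get_state` methods are hypothetical stand-ins for a reflexive component; the actual CamNote component interface is not described in the text.

```python
class ControlPanel:
    """Hypothetical reflexive component: it can be stopped, inspected and restarted."""

    def __init__(self, platform):
        self.platform = platform
        self.slide = 1
        self.running = False

    def start(self, state=None):
        # Resume from a saved state when one is supplied.
        if state is not None:
            self.slide = state["slide"]
        self.running = True

    def stop(self):
        self.running = False

    def get_state(self):
        return {"slide": self.slide}


def replace_component(old, new):
    """Stop the old component, capture its state, and launch the new one from it."""
    old.stop()
    state = old.get_state()
    new.start(state)
    return new
```

For instance, replacing a PC control panel that is showing slide 7 with a PDA control panel leaves the new panel running on slide 7.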

The components referred to in the action plan do not necessarily exist as executa- 
ble code. They may instead be high-level descriptions such as task models or AUI’s. 
If so, the configurator relies on reificators to produce executable code as in Dygimes
and iCrafter. A retrieved component may be executable, but may not fit the
requirements. It may thus be reverse-engineered through abstractors, and then
transformed by translators and reified again into executable code [23].



5.4 Discussion 

CAMELEON-RT is a functional decomposition that covers all aspects of DMP UI’s. 
It is not an implementational architecture. In particular, we do not address the alloca- 
tion of functions across processes and processors, and we leave open the choice of 
architecture styles. However, the nature of ubiquitous computing (e.g., platform het- 
erogeneity and dynamicity, interactive islands based on ad-hoc networks), suggests 
the following heuristics: apply the principles of exo-kernels by making a clear dis- 
tinction between core functions and extension functions that can be called upon dy- 
namically and possibly remotely. By doing so, low-end elementary platforms can be 
addressed. Replace the client-server model with a P2P architecture style so that ad- 
hoc interactive islands can be constructed on the fly. Use a reflexive component- 
connector approach to support software reconfiguration. Our early experience with 
the implementation of the Contextors infrastructure, I-AM, and CamNote
demonstrates that these rules are viable.

6 Conclusion

The analysis of the state of the art shows that current research projects address a small 
subset of the problem space of distributed, migratable, and plastic user interfaces. The 
CAMELEON-RT model provides software designers with a general framework that
addresses both small- and large-scale computing environments, as well as all forms of
UI distribution, migration and plastification. In addition, it makes explicit the notion
of meta-user interface, an emerging notion found implicitly in the literature in expres- 
sions like “how users will configure and control their environment?”. We propose 
early heuristics to facilitate the exploitation of CAMELEON-RT for the practical 
deployment of run time infrastructures. These need to be refined and CAMELEON-RT
must be evaluated with further experiments.

Abstract. Energy conservation is a critical issue in wireless sensor networks.
We formulate the energy conserving routing problem as a nonlinear program whose
objective is to maximize the network lifetime until the first node battery
drains out. We prove that the nonlinear program can be converted to an equivalent
maximum multi-commodity concurrent flow problem and develop an iterative
approximation algorithm based on a revised shortest path scheme. We then discuss
the feasibility, precision and computation complexity of the algorithm through
theoretic analysis, and provide some optimization methods to reduce the
algorithm running time. Performance simulation and comparison show the
effectiveness of the algorithm.



1 Introduction 

With the rapid development of low cost sensor devices, there is an increased
interest in the deployment of wireless sensor networks, which are suitable for
applications such as military and environment surveillance. Low cost sensors are
typically powered by short-lived batteries, and therefore conserving battery energy
is a prime consideration in these networks. Since the battery energy is mainly
depleted by data transmission, it is necessary to use energy aware or energy
conserving routing mechanisms to extend the network lifetime.

We investigate the energy conserving routing problem in wireless sensor net- 
works and propose an iterative approximation algorithm for it. The objective of 
our algorithm is to maximize the network lifetime under the given data collection 
rate. We formulate the energy conserving routing problem as a nonlinear pro- 
gram and prove it can be converted to an equivalent maximum multi-commodity 
concurrent flow problem. Then we adapt and extend the techniques for
multi-commodity flow problems first proposed by Garg and Könemann [1] to develop
an iterative approximation algorithm for our problem. Theoretic analysis shows
that our algorithm is a $(1-\epsilon)^{-3}$ approximation of the optimal solution. To
reduce the computation complexity, we also present some optimization methods
for the algorithm. Performance simulation and comparison show the effectiveness
of this algorithm.

In our study we assume the topology of the network is relatively static and
the data collected by each node is transmitted to a central node, which runs the
energy conserving routing algorithm and disseminates the calculated routing
information to each node in the network. These assumptions are well suited
to wireless sensor networks, in which the network topology changes slowly and
there is a sink which plays the central node role to collect data.

The rest of the paper is organized as follows. In section 2, we discuss re- 
lated work to place our contributions in context. We define the problem and 
present its multi-commodity concurrent flow formulation in section 3. An itera- 
tive approximation algorithm is proposed in section 4. We discuss the feasibility, 
approximation precision and computation complexity of the algorithm through 
theoretic analysis in section 5. The algorithm implementation and its
performance evaluation are provided in section 6. We conclude the paper in section 7.

2 Related Work 

Most of the literature in this area has focused on minimum energy routing [13],
[14], [15] or energy aware routing [3], [5], [6], [7], [9], [10], [11]. Minimum energy
routing minimizes the energy consumed per unit flow or packet to reach
the destination. If all the traffic is routed through the same minimum energy path
to the destination, the nodes on that path will drain their batteries quickly,
while other nodes, which perhaps have more power left, will remain intact.
Instead of trying to minimize the energy consumed per unit flow, energy aware
routing extends the lifetime of the whole network by taking into account the
remaining energy. Toh has proposed the Conditional Max-Min Battery Capacity
Routing (CMMBCR), which selects the shortest path among the nodes whose
remaining battery power is above a certain threshold. Singh et al. studied
different node battery power metrics and concluded that remaining energy routing
can give significant energy savings compared with traditional hop count based
routing. Heinzelman et al. presented a family of adaptive protocols called SPIN
for energy efficient dissemination of information throughout the sensor network.

Recently the maximum multi-commodity flow formulation of the energy conserving
routing problem has attracted much attention [4], [8], [16]. Garg et al. present an
excellent discussion of the current fast approximation techniques for solving the
multi-commodity flow problem. Kalpakis et al. use multi-commodity network
flows to examine the MLDA (Maximum Lifetime Data Aggregation) problem
and the MLDR (Maximum Lifetime Data Routing) problem. Chang et al. propose
a class of flow augmentation and flow redirection algorithms based on the
multi-commodity flow formulation to maximize the lifetime of an ad hoc network.
N. Sadagopan devises the A-MAX algorithm to maximize the data extraction in
wireless sensor networks, which is also based on the maximum multi-commodity
flow problem. Although the maximum multi-commodity flow algorithms can
solve the energy conserving routing problem efficiently, they also lead to severe
unfairness among the commodities or sensor nodes: as noted in [2], the nodes
near the sink would monopolize the entire flow. The reason is that the maximum
multi-commodity flow algorithm always favors the low cost routes in the
network, so the nodes that are far away from the sink node are penalized and
have little chance to get enough bandwidth. Gil Zussman and Adrian Segall
design a binary maximum flow algorithm (BMF) to overcome this restriction, but
the algorithm is time consuming.

In order to eliminate the unfairness among sensor nodes while maximizing 
the network lifetime, we formulate the energy conserving routing as a maximum 
multi-commodity concurrent flow problem instead of an ordinary maximum flow 
problem as in [2], [8]. We propose an iterative approximation algorithm for energy
conserving routing based on a revised shortest path scheme, which is easy to
implement and needs less running time.



3 Problem Formulation 



Consider a directed network graph $G = (V, E)$, where $V$ is the node set (including the
Sink node) and $V - \mathrm{Sink}$ is the sensor node set. If node $i$ can communicate with
node $j$, there is a directed edge $(i,j) \in E$, where $E$ is the set of all directed
edges in the graph. We assume all the nodes in the network are homogeneous
except the Sink node, i.e. the initial battery energy and the transmission power
of all sensor nodes are the same, while there is no battery limitation for the Sink
node. Given the data collection rate $f_i$ for each sensor node, the objective of
energy conserving routing is to find the link flows such that the network lifetime
is maximized.

Suppose $W$ denotes the initial battery energy and $p$ is the energy consumption
of transmitting a unit flow. The lifetime of a sensor node and of the network is
defined as follows.

Definition 1. (Chang and Tassiulas [8]) The lifetime of node $i$ under a given
flow is denoted by $T_i$ and is given by

$$T_i = \frac{W}{p \sum_{(i,j) \in E} f_{ij}}$$

where $f_{ij}$ is the flow on edge $(i,j)$. The corresponding lifetime of the network
is the time until the first battery drains out, i.e. the minimum lifetime over all
nodes. It is denoted by $T$ and is given by

$$T = \min_{i \in V} T_i = \min_{i \in V} \frac{W}{p \sum_{(i,j) \in E} f_{ij}}$$
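
As a small numerical illustration of Definition 1, the following Python sketch computes node and network lifetimes from a set of link flows. The function names and the dictionary representation of flows (keyed by directed edge) are assumptions made for this example; the minimum is taken over the sensor nodes only, since the Sink has no battery limitation.

```python
def node_lifetime(i, flows, W, p):
    """T_i = W / (p * sum of the flows leaving node i)."""
    outgoing = sum(f for (u, _), f in flows.items() if u == i)
    if outgoing == 0:
        return float("inf")  # a node that never transmits never drains its battery
    return W / (p * outgoing)


def network_lifetime(sensors, flows, W, p):
    """T = minimum node lifetime, i.e. the time until the first battery drains out."""
    return min(node_lifetime(i, flows, W, p) for i in sensors)


# Node 2 sends 100 byte/s to node 1, which forwards 200 byte/s to the sink (node 0).
example_flows = {(2, 1): 100.0, (1, 0): 200.0}
```

With $W = 1$ J and $p = 10^{-4}$ J/byte, node 1 lasts about 50 s and node 2 about 100 s, so the network lifetime is about 50 s: the relay node next to the sink is the bottleneck.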



Based on the lifetime definition, we can formulate the energy conserving
routing (ECR) problem as a nonlinear program.

Problem ECR.

$$\max T$$

$$\sum_{(i,j) \in E} f_{ij} - \sum_{(k,i) \in E} f_{ki} = f_i, \quad i \in V - \mathrm{Sink} \qquad (1)$$

$$p T \sum_{(i,j) \in E} f_{ij} \le W, \quad i \in V - \mathrm{Sink} \qquad (2)$$

$$f_{ij} \ge 0, \quad (i,j) \in E$$

where (1) is the flow conservation constraint and (2) is the battery energy constraint.
To simplify the theoretic analysis, we only consider the power consumption
of transmitting data; in fact our algorithm can adapt to more complex power
consumption models.

We first consider a maximum concurrent flow (MCF1) problem which is
similar to the ECR problem. Suppose the maximum network lifetime $T$ is given,
and the data collection rate of sensor node $i$ is $\lambda r_i$ ($r_i = \theta f_i$, $i \in V - \mathrm{Sink}$, where $\theta$ is a
constant whose value is discussed in section 4). The objective of the MCF1 problem is
to find the link flows such that $\lambda$ is maximized.

Problem MCF1.

$$\max \lambda$$

$$\sum_{(i,j) \in E} f_{ij} - \sum_{(k,i) \in E} f_{ki} = \lambda r_i, \quad i \in V - \mathrm{Sink}$$

$$p T \sum_{(i,j) \in E} f_{ij} \le W, \quad i \in V - \mathrm{Sink}$$

$$f_{ij} \ge 0, \quad (i,j) \in E$$

Although the objective function of the ECR problem is quite different from that
of the MCF1 problem, their optimal solutions are tightly related.

Theorem 1. Assume $\lambda_{\max}$, $f_{ij}^{\max}$ are the optimal solutions of the MCF1 problem
under a given network lifetime $T$. If $\lambda_{\max} r_i = f_i$, then $T$, $f_{ij}^{\max}$ are the optimal
solutions of the ECR problem.

Proof. Suppose the optimal solutions of the ECR problem are $T_{\max}$ and $f_{ij}$ instead
of $T$, $f_{ij}^{\max}$ when $\lambda_{\max} r_i = f_i$. Then $T_{\max} > T$, because there exist at least
the feasible link flows $f_{ij}^{\max}$, which make the network lifetime equal to $T$.

Since $T_{\max} > T$, at time $T$ the remaining battery energy of all the sensor
nodes in the ECR problem is greater than 0, which implies that there exists a small
positive number $\delta > 0$ such that, when $f_i' = f_i + \delta$, the network lifetime of the ECR
problem is exactly equal to $T$; under this condition the link flows are denoted as $f_{ij}'$.
Thus $f_i'/r_i$, $f_{ij}'$ are also feasible solutions of the MCF1 problem, while the optimal
solution of MCF1 is $\lambda_{\max} = f_i/r_i < f_i'/r_i$, which leads to a contradiction. Therefore
$T$, $f_{ij}^{\max}$ are the optimal solutions of the ECR problem.

The MCF1 problem contains node flows, which we need to convert to edge
flows. Add a virtual directed edge $(\mathrm{Sink}, i)$ from the Sink node to each sensor node
$i$; the link flow of this edge is $f_{\mathrm{Sink},i} = \lambda r_i$. The edge set is now denoted as $E'$
and the network graph is $G = (V, E')$. The MCF1 problem can be rewritten as follows:

$$\max \lambda$$

$$\sum_{(i,j) \in E'} f_{ij} - \sum_{(k,i) \in E'} f_{ki} = 0, \quad i \in V$$

$$T \sum_{(i,j) \in E} f_{ij} \le \frac{W}{p}, \quad i \in V - \mathrm{Sink}$$

$$f_{ij} \ge 0, \quad (i,j) \in E$$

To remove the undetermined value $T$, we make the following substitutions

$$D_{ij} = \frac{pT}{W} f_{ij}, \qquad d = \frac{pT}{W} \lambda, \qquad D_{\mathrm{Sink},i} = \frac{pT}{W} \lambda r_i = d\, r_i$$

and formulate the MCF problem.

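Problem MCF.

$$\max d$$

$$\sum_{(i,j) \in E'} D_{ij} - \sum_{(k,i) \in E'} D_{ki} = 0, \quad i \in V$$

$$\sum_{(i,j) \in E} D_{ij} \le 1, \quad i \in V - \mathrm{Sink}$$

$$D_{ij} \ge 0, \quad (i,j) \in E$$

If the optimal solutions of the MCF problem are $d_{\max}$ and $D_{ij}^{\max}$, then according to
theorem 1, $\lambda_{\max} r_i = \frac{W d_{\max}}{pT} r_i = f_i$, i.e. $T = \frac{W d_{\max} r_i}{p f_i}$, which is the maximum
lifetime of the ECR problem. Similarly, $f_{ij}^{\max} = \frac{W D_{ij}^{\max}}{pT}$. We can see that the ECR
problem can be converted to the equivalent MCF problem. In the following section we
will propose an iterative algorithm to solve the MCF problem efficiently.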
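
The conversion from an MCF solution back to the ECR quantities can be checked numerically. The Python sketch below is only an illustration: the function name `mcf_to_ecr` and the argument `r_over_f` (the constant ratio $r_i / f_i$) are introduced here and are not part of the original formulation.

```python
def mcf_to_ecr(d_max, D_max, r_over_f, W, p):
    """Recover the lifetime T and the link flows f_ij from an MCF solution.

    Uses T = W * d_max * r_i / (p * f_i) and f_ij = W * D_ij / (p * T).
    r_over_f is the ratio r_i / f_i, assumed identical for all sensor nodes.
    """
    T = W * d_max * r_over_f / p
    flows = {edge: W * D / (p * T) for edge, D in D_max.items()}
    return T, flows


# Two sensors with f_i = 100 byte/s and r_i = 0.5; node 2 relays through node 1.
T, flows = mcf_to_ecr(
    d_max=1.0,
    D_max={(2, 1): 0.5, (1, 0): 1.0},
    r_over_f=0.5 / 100.0,
    W=1.0,
    p=1e-4,
)
```

In this example the recovered lifetime is about 50 s and the recovered link flows are about 100 byte/s and 200 byte/s, which agrees with computing the lifetime of the relay node directly from Definition 1.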



4 Approximation Algorithm 

We propose an iterative algorithm which is based on the Garg-Könemann
algorithm [1].

Before proposing the algorithm, we first consider the dual of the MCF
problem (DMCF):

DMCF Problem.

$$\min \sum_{i \in V - \mathrm{Sink}} y_i$$

$$x_j - x_i + y_i \ge 0, \quad i \in V - \mathrm{Sink}$$

$$\sum_{i \in V - \mathrm{Sink}} r_i (x_i - x_{\mathrm{Sink}}) \ge 1$$

$$y_i \ge 0, \quad i \in V - \mathrm{Sink}$$




308 L. Zhang et al. 



Let $l(i,j) = y_i$; the DMCF problem is rewritten as follows

$$\min \sum_{i \in V - \mathrm{Sink}} y_i$$

$$l(i,j) \ge x_i - x_j, \quad i \in V - \mathrm{Sink}$$

$$\sum_{i \in V - \mathrm{Sink}} r_i (x_i - x_{\mathrm{Sink}}) \ge 1$$

$$y_i \ge 0, \quad i \in V - \mathrm{Sink}$$

Suppose $c(i,j) = r_i\, l(i,j)$ is the cost of routing $r_i$ units of flow through
link $(i,j)$. Consider an arbitrary path from node $i$ to the Sink,
$P_i = (i, j_1, \ldots, j_k, \mathrm{Sink})$. The cost of routing $r_i$ units of flow through path $P_i$ is

$$\begin{aligned}
c(P_i) &= r_i \left( l(i,j_1) + l(j_1,j_2) + \cdots + l(j_{k-1},j_k) + l(j_k,\mathrm{Sink}) \right) \\
&= r_i \left( y_i + y_{j_1} + \cdots + y_{j_{k-1}} + y_{j_k} \right) \\
&\ge r_i \left( x_i - x_{j_1} + x_{j_1} - x_{j_2} + \cdots + x_{j_{k-1}} - x_{j_k} + x_{j_k} - x_{\mathrm{Sink}} \right) \\
&= r_i (x_i - x_{\mathrm{Sink}})
\end{aligned}$$

If each sensor node $i$ routes $r_i$ units of flow to the Sink, the total cost is

$$C = \sum_{i \in V - \mathrm{Sink}} r_i\, l(P_i) \ge \sum_{i \in V - \mathrm{Sink}} r_i (x_i - x_{\mathrm{Sink}}) \ge 1$$

where $l(P_i)$ is the length of path $P_i$. Thus, for any path from node $i$ to the Sink,
under the $c(i,j) = r_i\, l(i,j)$ metric the total cost to route $r_i$ units of flow is greater
than or equal to 1. This conclusion is simple but important: it implies we can use a
shortest path algorithm to solve the MCF problem.

Let $P_i^{\min}$ be the shortest path from node $i$ to the Sink in the metric $l(i,j)$
and $\alpha$ be the total cost of routing $r_i$ units of flow through these shortest paths.
Let $Y = \sum_{i \in V - \mathrm{Sink}} y_i$; thus, the objective of DMCF is to minimize $Y$ such that
$\alpha \ge 1$. This is equivalent to minimizing $Y/\alpha$. Let $\beta$ be this minimum, i.e.

$$\beta = \min \frac{Y}{\alpha}$$

Here we should ensure $\beta \ge 1$, because this is used in the later theoretic analysis.
If $\beta < 1$, we can increase it by choosing $\theta$ appropriately.

Let

$$\theta = \frac{1}{n \max_{i \in V - \mathrm{Sink}} (f_i)}$$

we have

$$\beta = \min \frac{Y}{\alpha} = \min \frac{\sum_{i \in V - \mathrm{Sink}} y_i}{\sum_{i \in V - \mathrm{Sink}} r_i\, l(P_i^{\min})} \ge \min \frac{n \sum_{i \in V - \mathrm{Sink}} y_i}{\sum_{i \in V - \mathrm{Sink}} l(P_i^{\min})} \ge 1$$

where the maximum of $\beta$ is $n$, i.e. $1 \le \beta \le n$.

The number of nodes in the network is $|V| = n + 1$, and we number the sensor
nodes as $1, 2, 3, \ldots, n$. The approximation algorithm proceeds in phases. Each phase
is composed of $n$ iterations; in the $k$th iteration sensor node $k$ finds the shortest
path to the Sink and routes $r_k$ units of flow along this path.

Let $y_i^{t,k}$ be the value of $y_i$ after the $k$th iteration of the $t$th phase. The initial
value is $y_i^{1,0} = \delta$, $\delta > 0$; the choice of $\delta$ is discussed in section 5.

Now we propose the algorithm as follows:

(1) $t = 1$, $y_i^{1,0} = \delta$

(2) $k = 1$; if $Y^{t,0} \ge 1$

        stop iteration and exit,
    else

        while ($k \le n$) do

        (a) find the shortest path $P_k^{\min}$ from node $k$ to Sink in the
            metric $l(i,j)$

        (b) route $r_k$ units of flow along $P_k^{\min}$ and update the
            corresponding edge flows, $D_{ij} = D_{ij} + r_k$, $(i,j) \in P_k^{\min}$

        (c) $y_i^{t,k} = y_i^{t,k-1} (1 + r_k \epsilon)$ for the nodes $i$ on $P_k^{\min}$, $k = k + 1$

(3) $t = t + 1$, go to step (2)

On termination of the algorithm, $D_{ij}$ and $(t-1) r_i$ give the link flows, but
their values may violate the capacity constraints (from $\sum_{(i,j) \in E} f_{ij} \le \frac{W}{pT}$ we get
$\sum_{(i,j) \in E} D_{ij} \le 1$), so we need to scale down all the flows by a factor of the maximum
over-utilization to get a feasible solution of the MCF problem.
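
The phase structure described above can be sketched in Python as follows. This is a minimal illustration under several assumptions, not the authors' implementation: nodes are integers with the Sink numbered 0, every sensor is assumed to have a path to the Sink, `adj` and `r` are hypothetical input names, and $\delta$ is set to $(n/(1-\epsilon))^{-1/\epsilon}$ as in section 5. The final scaling uses $\log_{1+\epsilon} 1/\delta$ from section 5.1, combined with the observed maximum node load as an extra safeguard that is not part of the text.

```python
import heapq
import math


def shortest_path(adj, y, src, sink):
    """Dijkstra in the metric l(i, j) = y[i]: an edge costs the dual value of its tail."""
    dist = {src: 0.0}
    prev = {}
    heap = [(0.0, src)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u == sink:
            break
        for v in adj.get(u, ()):
            nd = d + y[u]
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    path = [sink]
    while path[-1] != src:  # walk back from the sink; assumes the sink is reachable
        path.append(prev[path[-1]])
    return path[::-1]


def mcf_approx(adj, r, sink=0, eps=0.1):
    """Return (d, D): the concurrent-flow value and the scaled link flows."""
    sensors = sorted(r)
    n = len(sensors)
    delta = (n / (1.0 - eps)) ** (-1.0 / eps)
    y = {i: delta for i in sensors}
    D = {}
    phases = 0  # number of completed phases, i.e. t - 1 in the text
    while sum(y.values()) < 1.0:
        for k in sensors:
            path = shortest_path(adj, y, k, sink)
            for u, v in zip(path, path[1:]):
                D[(u, v)] = D.get((u, v), 0.0) + r[k]  # step (b)
                y[u] *= 1.0 + eps * r[k]               # step (c)
        phases += 1
    # Scale down to restore the node capacity constraints (sum of D_ij <= 1).
    load = {i: 0.0 for i in sensors}
    for (u, _), f in D.items():
        load[u] += f
    scale = max(math.log(1.0 / delta, 1.0 + eps), max(load.values()))
    return phases / scale, {e: f / scale for e, f in D.items()}
```

For two sensors that both reach the Sink directly with $r_1 = r_2 = 0.5$, the optimum of the MCF problem is $d = 2$, and the sketch returns a feasible value somewhat below it; a smaller $\epsilon$ tightens the gap at the cost of more phases.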

5 Algorithm Analysis 

5.1 Feasibility Analysis

Consider an arbitrary edge $e$ in $E'$. Since $y_i^{t,k} = y_i^{t,k-1} (1 + r_k \epsilon)$, for every unit
of flow routed through $e$ we increase its length by at least a factor $1 + \epsilon$. Initially,
its length is $\delta$; after $t-1$ phases, $l(e) \le Y^{t-1,n} < 1$. Thus the total amount of flow
through $e$ is at most $\log_{1+\epsilon} 1/\delta$ times its capacity, and scaling
down $D_{ij}$ and $(t-1) r_i$ by a factor of $\log_{1+\epsilon} 1/\delta$ gives a feasible solution of the
MCF problem:

$$D_{ij}^{\max} = \frac{D_{ij}}{\log_{1+\epsilon} 1/\delta}, \qquad d_{\max} = \frac{t-1}{\log_{1+\epsilon} 1/\delta}$$



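5.2 Approximation Precision Analysis

From $y_i^{j,k} = y_i^{j,k-1} (1 + r_k \epsilon)$, we get

$$Y^{j,k} = \sum_{i \in V - \mathrm{Sink}} y_i^{j,k} = \sum_{i \in V - \mathrm{Sink}} y_i^{j,k-1} + \epsilon\, r_k \sum_{i \in P_k} y_i^{j,k-1} = Y^{j,k-1} + \epsilon\, c_k^{j,k-1}$$

where $c_k^{j,k-1} = r_k \sum_{i \in P_k} y_i^{j,k-1}$ is the cost of the path used in the $k$th iteration.
Solving the above recurrence, we have

$$Y^{j,n} = Y^{j,0} + \epsilon \sum_{k=1}^{n} c_k^{j,k-1}$$

Since the edge lengths are monotonically increasing, i.e. $c_k^{j,k-1} \le c_k^{j,n}$, we have

$$Y^{j,n} \le Y^{j,0} + \epsilon \sum_{k=1}^{n} c_k^{j,n} = Y^{j,0} + \epsilon\, \alpha^{j,n} = Y^{j-1,n} + \epsilon\, \alpha^{j,n}$$

According to $\beta \le Y^{j,n} / \alpha^{j,n}$, we can obtain

$$Y^{j,n} \le \frac{Y^{j-1,n}}{1 - \epsilon/\beta}$$

From $Y^{0,n} = n\delta$ and $\beta \ge 1$, we get

$$Y^{j,n} \le \frac{n\delta}{(1 - \epsilon/\beta)^{j}} = \frac{n\delta}{1 - \epsilon/\beta} \left( 1 + \frac{\epsilon}{\beta - \epsilon} \right)^{j-1} \le \frac{n\delta}{1 - \epsilon/\beta}\, e^{\frac{\epsilon (j-1)}{\beta - \epsilon}} \le \frac{n\delta}{1 - \epsilon}\, e^{\frac{\epsilon (j-1)}{\beta (1 - \epsilon)}}$$

The algorithm stops when $Y^{t,n} \ge 1$, thus

$$1 \le Y^{t,n} \le \frac{n\delta}{1 - \epsilon}\, e^{\frac{\epsilon (t-1)}{\beta (1 - \epsilon)}}$$

which implies

$$\frac{\beta}{t-1} \le \frac{\epsilon}{(1 - \epsilon) \ln \frac{1 - \epsilon}{n\delta}}$$

so the ratio of the optimal solution and the approximation is

$$\gamma = \frac{\beta}{d_{\max}} = \frac{\beta \log_{1+\epsilon} 1/\delta}{t-1} \le \frac{\epsilon \log_{1+\epsilon} 1/\delta}{(1 - \epsilon) \ln \frac{1 - \epsilon}{n\delta}} = \frac{\epsilon \ln (1/\delta)}{(1 - \epsilon) \ln (1 + \epsilon) \ln \frac{1 - \epsilon}{n\delta}}$$

Let $\delta = (n/(1 - \epsilon))^{-1/\epsilon}$; we have

$$\gamma \le \frac{\epsilon}{(1 - \epsilon)^2 \ln (1 + \epsilon)} \le \frac{\epsilon}{(1 - \epsilon)^2 (\epsilon - \epsilon^2/2)} \le (1 - \epsilon)^{-3}$$

By linear programming duality we get

$$1 \le \gamma \le (1 - \epsilon)^{-3}$$

Now it remains to choose $\epsilon$ appropriately to meet our approximation requirement.

Therefore, by an appropriate choice of $\epsilon$ and $\delta$, $D_{ij}^{\max}$ and $d_{\max}$ can approximate
the optimal MCF solution as closely as desired.

5.3 Computation Complexity and Algorithm Optimization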

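From section 5.2, we have

$$1 \le \gamma = \frac{\beta \log_{1+\epsilon} 1/\delta}{t - 1}$$

thus

$$t \le 1 + \beta \log_{1+\epsilon} \frac{1}{\delta} = 1 + \frac{\beta}{\epsilon} \log_{1+\epsilon} \frac{n}{1 - \epsilon}$$

which implies the running time of our algorithm depends on $\beta$. Because $1 \le \beta \le n$,
the maximum running time is $\frac{n^2}{\epsilon} \log_{1+\epsilon} \frac{n}{1-\epsilon} \cdot T_{sp}$, where $T_{sp}$ is the time required
to compute a shortest path in graph $G$.

To reduce the running time further, we can first compute a $\gamma = 2$ approximation
to $\beta$ using the procedure outlined in section 4. If the procedure does not stop
in $\frac{2}{\epsilon} \log_{1+\epsilon} \frac{n}{1-\epsilon}$ phases, it implies $\beta > 2$; we double $r_i$ and continue the procedure.
If it does not halt in an additional $\frac{2}{\epsilon} \log_{1+\epsilon} \frac{n}{1-\epsilon}$ phases, we again double $r_i$, and so
on until the algorithm stops. This procedure requires $O(\log_2 n)$ phases and returns an
approximation value $\beta'$. Since $\gamma = 2$, we have $\beta \le \beta' \le 2\beta$. We multiply $r_i$ by $\beta'$ and
create a new instance which has $1 \le \beta \le 2$, so we need at most another
$\frac{2}{\epsilon} \log_{1+\epsilon} \frac{n}{1-\epsilon}$ phases to get the $(1 - \epsilon)^{-3}$ approximation. Therefore the total number
of phases is $O(\log_2 n + \frac{2}{\epsilon} \log_{1+\epsilon} \frac{n}{1-\epsilon})$ and the running time is
$O((\log_2 n + \frac{2}{\epsilon} \log_{1+\epsilon} \frac{n}{1-\epsilon})\, n T_{sp})$.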

6 Performance Evaluation and Comparison 

Table 1 shows the performance comparison of three different energy conserving
routing algorithms. All the algorithms are based on maximum flow theory. We
can see that the complexity of the MCF algorithm is higher than that of A-MAX, but
A-MAX cannot guarantee fairness among sensor nodes; BMF is a fair algorithm, but
it is more complicated than MCF.

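Table 1. Performance Comparison of Energy Conserving Routing Algorithms

Algorithm    Computation Complexity                                                                    Approximation          Fairness
MCF          $O((\log_2 n + \frac{2}{\epsilon} \log_{1+\epsilon} \frac{n}{1-\epsilon})\, n T_{sp})$    $(1 - \epsilon)^{-3}$    Yes
A-MAX [2]    $O(\frac{1}{\epsilon} \log_{1+\epsilon} (n)\, n T_{sp})$                                  $(1 - \epsilon)^{-2}$    No
BMF [16]     $O(n^6 \log n)$                                                                           $(1 - \epsilon)^{-1}$    Yes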



Suppose $n$ sensor nodes are distributed in a two-dimensional plane, and there is
a sink node without power limitation in the network to collect data. We can
implement the algorithm in the following three phases: topology information
collection, energy conserving routing construction and mobility manipulation.

Topology Information Collection. Each node first broadcasts a HELLO
message with an ID number to its neighbors. On receiving the HELLO messages,
each node constructs a neighbor node list and sends it to the sink node. Using
the neighbor node lists, the sink node can figure out the whole topology of the
sensor network.
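
How the sink might assemble the topology from the reported neighbor lists can be sketched in Python. The function name, the dictionary layout, and the choice of edge direction (a node that heard a HELLO from a sender is treated as reachable from that sender) are assumptions made for this illustration; the text does not specify a message format.

```python
def build_topology(neighbor_lists):
    """Build a directed adjacency map from per-node neighbor lists.

    neighbor_lists maps a node ID to the IDs whose HELLO messages it received.
    If node b heard node a, we record the directed edge (a, b).
    """
    adj = {node: set() for node in neighbor_lists}
    for receiver, heard in neighbor_lists.items():
        for sender in heard:
            adj.setdefault(sender, set()).add(receiver)
    # Sorted lists make the result deterministic and easy to inspect.
    return {node: sorted(targets) for node, targets in adj.items()}


# Node 0 is the sink; node 2 can only reach it through node 1.
reports = {0: [1], 1: [0, 2], 2: [1]}
topology = build_topology(reports)
```

The resulting adjacency map is the kind of graph on which the sink would then run the routing algorithm of section 4.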



Energy Conserving Routing Construction. After obtaining the network
topology, the sink node calculates the energy conserving routing information
using the algorithm outlined in section 4. This routing information is then sent
to each sensor node.



Mobility Manipulation. Each node broadcasts the HELLO message periodically;
the interval between two broadcasts is determined by the mobility speed.
When a node finds that the topology has changed, it sends the new neighbor
node list to the sink. The sink monitors the network topology change, and when it
exceeds a given threshold, the routing information is updated.

We show some numerical results of the algorithm by simulation. Suppose
$n = 50$ sensor nodes are evenly distributed in a 50 m x 50 m square area, a sink
node is placed in the center of the area, and each node has a radio transmission range
of 20 m. The transmission power consumption is $10^{-4}$ J/byte, and the initial energy
of each node is 1 J.
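
A comparable simulation scenario can be generated with a few lines of Python. This sketch makes assumptions that the text does not state: "evenly distributed" is interpreted as independent uniform random placement, the random seed is arbitrary, the sink is node 0, and links are taken to be symmetric whenever two nodes are within radio range.

```python
import math
import random


def make_network(n=50, side=50.0, radio_range=20.0, seed=0):
    """Place n sensors uniformly at random in a square with the sink at the center."""
    rng = random.Random(seed)
    pos = {0: (side / 2.0, side / 2.0)}  # node 0 is the sink
    for i in range(1, n + 1):
        pos[i] = (rng.uniform(0.0, side), rng.uniform(0.0, side))
    # Two nodes are linked when their Euclidean distance is within radio range.
    adj = {
        u: [v for v in pos if v != u and math.dist(pos[u], pos[v]) <= radio_range]
        for u in pos
    }
    return pos, adj


positions, adjacency = make_network()
```

The returned adjacency map, together with $W = 1$ J and $p = 10^{-4}$ J/byte, provides the inputs needed to run a routing algorithm on this scenario.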




Fig. 1. The effect of $\epsilon$ on the performance of MCF algorithm



First we observe the effect of parameter $\epsilon$ on the algorithm performance.
We set the data collection rate $f_i = 100$ byte/s and $\delta = (n/(1 - \epsilon))^{-1/\epsilon}$, and vary $\epsilon$
from 0.1 to 0.5. We count the number of iterations performed in the simulation;
the optimal results are computed by the BMF algorithm. Figure 1 illustrates the
relation between computation complexity and the approximation precision. We
can see that as the percentage of the optimal increases, the number of iterations also
increases; the percentage of the optimal is approximately a logarithmic function
of the number of iterations.

In figure 2 we set $\epsilon = 0.05$ and $\delta = (n/(1 - \epsilon))^{-1/\epsilon}$, and vary the data collection
rate $f_i$ from 100 byte/s to 1 kbyte/s. The fairness index is defined as follows:




where $f_i'$ is the actual data collection rate obtained by the energy conserving routing
algorithm. From figure 2 we can see that when the data collection rate is low, both
MCF and A-MAX are fair, but when the data collection rate is high, the
fairness index of the A-MAX algorithm increases rapidly, which means there is severe
unfairness among the sensor nodes.




Fig. 2. Fairness comparison of MCF and A-MAX algorithm (horizontal axis: data collection rate in byte/s)



7 Conclusions 

We propose a fair approximation algorithm for the energy conserving routing
problem in wireless sensor networks. The objective of our algorithm is to maximize
the network lifetime under a given data collection rate. We formulate the
problem as a maximum multi-commodity concurrent flow problem and develop an
iterative approximation algorithm based on a revised shortest path scheme. Theoretic
analysis shows our algorithm is a $(1 - \epsilon)^{-3}$ approximation of the optimal
solution. To reduce the computation complexity, we also present some optimizations
of the algorithm. We compare the performance with some other energy
conserving routing algorithms by theoretic analysis and simulation. We assume all
the sensor nodes are homogeneous in this paper; in future work we will extend our
algorithm to the more general case.

Abstract. Home networks and the interconnection of home appliances
are a classical theme in ubiquitous computing research. Security is a
recurring concern, but there is a lack of awareness of safety: preventing the
computerized house from harming the inhabitants, even in a worst-case 
scenario where an unauthorized user gains remote control of the facilities. 
We address this safety issue at the middleware level by restricting the 
operations that can be performed on devices according to the physical 
location of the user initiating the request. Operations that pose a po- 
tential safety hazard can only be performed within a physical proximity 
that ensures safety. 

We use a declarative approach integrated with an IDL language to ex- 
press location-based restrictions on operations. This model has been im- 
plemented in a middleware for home audio-video devices, using infrared 
communication and a local-area network to implement location aware- 
ness. 



1 Introduction 

With the emerging trend of ubiquitous computing, the idea of a computerized,
“intelligent” home is starting to become reality. The advantages of such a home
include easing household chores, providing entertainment, and saving energy by 
intelligently controlling the house temperature. Through the use of the Internet 
and mobile technology, many functions can even be controlled remotely. However, 
there is an obvious danger in giving computers control of the home, namely that 
operations that used to be carried out by a person aware of his environment now 
are invoked through a computer — the user only has limited, if any, awareness of 
the consequences of his actions. Moreover, if security is compromised, a “virtual 
intruder” can gain remote control over the functions of the home. 

The computerized installations of the home can and should be protected by 
appropriate security measures, e.g. passwords and encryption. However,
passwords alone do not provide the appropriate level of protection, since residents
might inadvertently perform actions that harm other residents of the house, or
unwittingly reveal their passwords if using them in public. To avoid compromising
the safety of the home, an extra safety layer is needed which ensures that a set of 
basic safety rules are obeyed when controlling functions in the home from afar. 
This extra safety layer not only applies to remotely controlling the home, but 
also to safety-critical functions that are controlled from within the home: turning 
on the cooking stove in the kitchen from the living room could, for example, be 
a potential safety hazard. 

We propose that each function of every computerized home appliance be
classified by design in terms of the maximal distance at which it can be controlled.
For example, safety-critical functions of appliances may only be controllable by 
people who are present in the same room since they would only then be fully 
aware of the consequences of their actions. Embedding fixed safety constraints 
in each device ensures that safety is maintained regardless of the complexity of 
the system as a whole and despite any erroneous configuration by the inhabitants.

Concretely, we are developing a safety-enabled middleware for home net- 
works. Safety concerns are expressed declaratively as access modifiers annotated 
on each operation in the software components of the system. At run-time, the 
middleware verifies that safety constraints are obeyed, in our implementation 
using a combination of infrared (IR) signals and a trusted local network. 

Contributions. The primary contributions of this work are as follows. First, we 
have defined a declarative, language-level approach to expressing safety concerns 
controlled by physical location. Second, we have implemented a complete solu- 
tion with Java language bindings based on infrared communication and a local 
network. Third, our IDL language extensions enable a novel form of dispatching, 
distance-based dispatching , where the receiver method is selected based on the 
physical distance between the caller and the callee. Last, our work serves to ex- 
plore an often-ignored aspect of ubiquitous computing, namely that of ensuring 
safety (as opposed to security): safety is about preventing potentially dangerous
situations from occurring, whereas security is about confidentiality, integrity, and 
availability. 

Background: Industrial Research Project with Bang & Olufsen. This 
work has been produced in the context of a research project in collaboration 
with the Danish audio-video (AV) systems producer Bang & Olufsen (B&O). 
B&O is a pioneer in home networking for AV devices: B&O products can be 
connected and share resources using the (proprietary) Masterlink network. This 
paper concerns safety issues that can be addressed at the middleware layer of a 
home network and hence apply globally throughout the entire system. 

Overview. The rest of this paper is organized as follows. First, Sect. 2 presents 
two safety-related case studies in the domain of home networking. Then, Sect. 3 
presents the principles behind our solution for ensuring safety, and Sect. 4 
presents our concrete implementation. Last, Sect. 5 presents related work, and 
Sect. 6 concludes.

[Figure 1 panels: (1) User powers on TV, it turns to default position.
(2) User places candlestick and goes into other room. (3) User powers off
all devices from other room, television returns to initial position,
candlestick falls over.]

Fig. 1. Bang & Olufsen scenario: remote operation of television with motorized foot.



2 Case Studies in Home Networking 

2.1 Case 1: B&O AV Devices 

B&O televisions can be equipped with a motorized foot, which can turn the 
entire television 45 degrees to either side; this feature can be used to get a 
perfect viewing angle (see Fig. 1, left). Most features of the television can be 
controlled remotely through the Masterlink. The motorized foot can, however, 
only be controlled from within the same room, so that the user is aware of the 
consequences of turning the television. If this were not the case, a user controlling
the television from a different room could inadvertently cause an accident, for 
example if a lit candlestick had been placed next to the television such that 
remotely operating the motorized foot would cause the candlestick to fall over 
and cause a fire (see Fig. 1, right). 

The simple restriction on the motorized foot is, however, non-trivial to imple- 
ment. First, other operations implicitly operate the motorized foot as well: e.g. 
powering off the television causes it to turn back to the neutral position. Any 
device can be used to turn off all other devices on the network, but the television 
should only return to neutral position if this command was issued from within 
the same room. Second, the restrictions are not part of the Masterlink protocol, 
but are implemented manually at the application layer of each device. 



2.2 Case 2: Internet-Controlled Home Appliances 

Home appliances can be controlled using protocols such as X10, UPnP, and
LonWorks, optionally delegating outside communication to a generic gateway
such as OSGi. For example, residents can remotely turn on the oven to cook
the food such that it is ready when they return from work. Similarly, residents
who have left the house can remotely turn off the central heating to save energy
or remotely open the front door to allow a friend to enter the house. However, 
controlling these appliances remotely may lead to unsafe situations. Turning on 
the oven is a potential fire hazard and turning off the central heating could cause 
water pipes to break due to frost. Moreover, a thief who gains remote control of 
the house could open the front door at night while the inhabitants are asleep.



2.3 Analysis 

The key observation in both case studies is that remotely controlling certain 
home appliances from “too far away” is a potential safety hazard. To ensure 
the safety of the inhabitants of the house, this remote control needs to be re- 
stricted somehow regardless of what security measures are put in place. Indeed, 
an inhabitant who unwittingly tries to perform an operation that is unsafe for other
inhabitants should be prevented from doing so. Furthermore, an unauthorized
person can gain access to the functionality of the house if the password of an
inhabitant is somehow revealed; this should not result in an unsafe situation for
the inhabitants of the house. 

In our case studies we have identified three levels of “nearness” required for 
safe operation: 

present. Close enough that the user is aware of the consequences of his action,
which we define to be that the operation was initiated within the same room.
Movement of a part of a device (e.g., turning a television, ejecting a VCR
tape, or opening a CD or DVD tray) is only safe when the user is directly
aware of the consequences of his action, as is also the case for operations
that can start a fire (e.g., turning on a toaster, oven, or cooking stove).
local. Close enough that the user can be trusted not to do damage to the home,
which we define to be that the operation was initiated within the home.
Unlocking the front door or turning off the central heating is a potential
hazard to the integrity of the house and is only safe when the user is inside
the house (we assume that people who are present in the house are trusted).
Turning on devices that may overheat or shock residents at home when
turned on also requires the user to be nearby.
global. Anywhere, which in practice means anywhere with access to the
Internet. Several operations are always safe to perform; this includes turning
off devices (unless they move when they are turned off) and turning up
and down the heat (within a normal temperature range). Considering only
safety issues, these operations may be performed from any location, although
subject to security-based access control.

Ultimately, balancing safety guarantees with usability must be done by the man- 
ufacturer; we here describe only the safety concerns. The reification of the iden- 
tified location categories as language-level constructs in a middleware for home 
networking is the subject of the next section.

3 Distance-Based Access Modifiers

3.1 Considerations 

Executing an operation may involve invoking operations defined by software 
components located in other devices. Safety-critical operations should only be 
executed if the user who initiated the operation by operating a device is located 
physically close to the receiving device — how close depends on the specification 
of the operation. Determining the location of a user cannot in general be done in 
a completely reliable way, and hence the degree of safety offered by the system 
is limited by the technology used for determining the location. 



3.2 Conceptual Model 

Conceptually, our location model has three domain-specific distances: present,
local and global. When seen as a relation between devices, each of these is
reflexive, transitive and symmetric. Furthermore, two devices that are present to
each other are also local to each other, and likewise for local and global. This
results in a hierarchy of zones. Each device is a member of a present zone, a room, 
possibly together with other devices. A local zone, a house , encloses a number of 
non-overlapping present zones. Likewise, the global zone, the Internet, encloses 
a number of non-overlapping local zones. 
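
The nesting of zones can be sketched in Java as follows. This is only an illustration under stated assumptions, not the Bisu implementation: the names Zone, Device, houseId and roomId are introduced here for the example, and rooms and houses are reduced to symbolic identifiers.

```java
/** The three symbolic distances, ordered from nearest to farthest. */
enum Zone {
    PRESENT, LOCAL, GLOBAL;

    /** True if a caller at distance 'actual' meets a requirement of 'this'. */
    boolean isSatisfiedBy(Zone actual) {
        return actual.ordinal() <= this.ordinal();
    }
}

/** Hypothetical device with a symbolic location: a room nested in a house. */
final class Device {
    final String name;
    final String houseId; // identifies the enclosing local zone
    final String roomId;  // identifies the present zone within that house

    Device(String name, String houseId, String roomId) {
        this.name = name;
        this.houseId = houseId;
        this.roomId = roomId;
    }

    /** Nearest zone shared by this device and the other device. */
    Zone zoneOf(Device other) {
        if (!houseId.equals(other.houseId)) {
            return Zone.GLOBAL;
        }
        if (!roomId.equals(other.roomId)) {
            return Zone.LOCAL;
        }
        return Zone.PRESENT;
    }
}
```

Because the zones are nested, a single ordered comparison is enough to decide whether a caller is close enough: a present caller satisfies a local or global requirement, but not the other way round.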

Our three location categories and their property of hierarchical nesting give
a very simple location model with relative symbolic locations. In spite of its
simplicity, the model covers all safety issues found in the scenario analysis. The
location model is easy to understand because of the mapping to concepts such as
room and house. This is paramount when safety restrictions may be viewable at
the user level. Furthermore, the model does not require knowledge of absolute 
positions of devices in the home, which would require either manual configu- 
ration or support by an advanced location system. Nonetheless, locations from 
a location system using physical positions or most kinds of symbolic locations 
could be translated to the location zones using extra information about the
infrastructure.

The origin of a call is the device the user manipulated to initiate the (possibly 
distributed) sequence of operation invocations. When an operation is invoked at 
run-time, the call is annotated with the identity of the origin of the call, as illus- 
trated in Fig. 2. This annotation, called an origin annotation, is propagated to 
callee operations. Before executing a restricted operation, the annotation is used 
to verify both that the call did indeed originate from the origin device and that 
it is currently in the allowed area. How the origin is queried is implementation- 
specific (see discussion in Sect. 4), but it must be done in a way that does not 
compromise the safety guarantees provided by the implementation. 

Mobile devices may require stricter verification depending on how location 
awareness is implemented, for which reason we require the origin annotation to 
be flagged “mobile” when the device should be considered mobile. 
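
The propagation of the origin annotation through a call sequence can be sketched as follows. The sketch is a simplification under assumptions: OriginAnnotation, LocationOracle and the device identifiers are hypothetical names, the location query is reduced to a single callback, and authentication of the origin is omitted.

```java
enum Zone { PRESENT, LOCAL, GLOBAL }

/** Identifies the device the user manipulated; immutable so callees pass it on unchanged. */
final class OriginAnnotation {
    final String originId;
    final boolean mobile; // mobile origins may need stricter verification

    OriginAnnotation(String originId, boolean mobile) {
        this.originId = originId;
        this.mobile = mobile;
    }
}

/** Stand-in for the implementation-specific location query (hypothetical). */
interface LocationOracle {
    Zone currentZone(String originId, String receiverId);
}

final class MotorizedFoot {
    private final LocationOracle oracle;
    int position = 0;

    MotorizedFoot(LocationOracle oracle) {
        this.oracle = oracle;
    }

    /** Restricted operation: only executed for a present origin. */
    void turn(OriginAnnotation origin, int degrees) {
        if (oracle.currentZone(origin.originId, "foot") != Zone.PRESENT) {
            throw new IllegalStateException("origin " + origin.originId + " is not present");
        }
        position = degrees;
    }
}

final class Screen {
    private final MotorizedFoot foot;
    boolean on = true;

    Screen(MotorizedFoot foot) {
        this.foot = foot;
    }

    /** Unrestricted operation that forwards the caller's origin to the restricted one. */
    void powerOff(OriginAnnotation origin) {
        on = false;
        try {
            foot.turn(origin, 0); // same annotation, not a fresh one created by the screen
        } catch (IllegalStateException e) {
            // Origin is too far away: the screen is off but the foot stays where it is.
        }
    }
}
```

The point of the sketch is that the screen never substitutes its own identity for that of the origin, so the check in the motorized foot is always made against the device the user actually operated.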

Fig. 2. Call annotations and verification in a call sequence



3.3 Language Constructs 

To enforce distance-based restrictions on an operation of a device, the program- 
mer can annotate the operation with a distance-based access modifier which 
indicates the zone from which the operation can be invoked. The modifier is 
similar to “private” and “public” as known from object-oriented languages, but 
is used to dynamically determine if a method can be called based on the distance 
from which it was invoked. 

The annotations are incorporated into our IDL language, which allows the 
programmer to declaratively specify safety considerations in the interface. Nev- 
ertheless, since the access modifiers cannot be tuned dynamically depending on 
the usage context, they should only be used for operations where an absolute 
minimal safety level is needed. Allowing the residents to tune the safety restric- 
tions would compromise the system; tuneable safety restrictions must thus be 
handled at the application level. 



Home Network Scenario. The IDL is extended with one distance-based ac- 
cess modifier for each zone, so operations can be annotated as present, local, 
or global. As an example, the IDL for a simplified B&O Avant television with 
safety modifiers is shown in Fig. 3. The Avant television is composed of a screen, 
a motorized foot, and an integrated video cassette recorder. Almost all opera- 
tions in the television have the modifier global , as using them does not incur a 
safety risk. One exception is the method powerOn on the screen which is limited 
to local callers because there is a small risk that a device which is turned on can 
overheat or may shock or confuse residents at home. The methods turn in the 
motorized foot and eject in the VCR can only be invoked by a present caller 
because physical movement is a safety issue. Nonetheless, the method powerOff

module com.beo.avant {

  interface Screen {
    local void powerOn();
    global void powerOff();
    global boolean getPowerStatus();
  };

  interface VCR {
    global void play(in int speed);
    global void stop();
    present void eject();
  };

  interface MotorizedFoot {
    present void turn(in short degrees);
    global short getPosition();
  };

};



Fig. 3. Interface for the Avant television in the safety modifier extended IDL 



which has global access calls turn in the motorized foot to turn the television 
into the neutral position; the middleware ensures that the success of this indi- 
rect call depends on the location of the caller of powerOff. Thus, the present 
modifier is not needed on the method powerOff because the operation in the 
motorized foot defines its safety restrictions itself. 1 



3.4 Distance-Based Dispatching 

We observe that it is often the case that two different variants of an operation 
exist, a more and a less privileged version, to be executed depending on the 
origin of the caller. For example, the method eject in the VCR should not eject 
but only stop the tape if the user is not present in the same room. This behavior 
is not an exceptional case but a part of the functionality of the VCR. 

To make such operation variants explicit, we allow overloading by access 
modifiers in the IDL. As an example, we can redeclare the method eject of the 
component VCR (from Fig. 3), as follows: 

present void eject(); // stop and eject the videotape
global void eject();  // stop and flash LED to indicate error

This method now has both a present and a global variant. Resolution of what 
method to call takes place at run-time, depending on the origin of the call. 
Our IDL language extensions thus enable distance-based dispatching: the receiver
method is selected based on which location zone the caller is in relative to
the receiver. This approach can be seen as a specialized form of predicate
dispatching, where the only predicates are membership of different location
zones [2].

1 Ultimately, the manufacturer must balance safety guarantees with usability; we here 
describe how we would design the interface of the television.

The advantages of distance-based dispatching include that the conditionals
and exception handlers that were otherwise needed to handle distance-based
variants of operations are eliminated from the code, and that the code for each
special case is clearly separated. The approach incorporates a version of the
Strategy design pattern into the language [4]. Instead of using different classes to
implement various implementations of a method, the methods are implemented
in the same class and the middleware determines which of them is the appropriate
one to call.
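
Run-time resolution among distance-overloaded variants could look roughly like the sketch below. It assumes that the nearest-zone variant admitting the caller is the one chosen; the text does not spell out this rule, and VariantTable is a hypothetical helper, not part of the middleware.

```java
import java.util.EnumMap;
import java.util.Map;

enum Zone { PRESENT, LOCAL, GLOBAL }

/** Variants of one operation, keyed by the zone named in their access modifier. */
final class VariantTable {
    private final Map<Zone, Runnable> variants = new EnumMap<>(Zone.class);

    VariantTable declare(Zone modifier, Runnable body) {
        variants.put(modifier, body);
        return this;
    }

    /**
     * Runs the nearest-zone variant that the caller qualifies for.
     * Returns false when no declared variant admits the caller.
     */
    boolean dispatch(Zone callerZone) {
        for (Zone z : Zone.values()) { // PRESENT, then LOCAL, then GLOBAL
            boolean admitted = callerZone.ordinal() <= z.ordinal();
            Runnable body = variants.get(z);
            if (admitted && body != null) {
                body.run();
                return true;
            }
        }
        return false;
    }
}
```

With the two eject variants from the example, a present caller reaches the variant that ejects the tape, whereas local and global callers fall through to the variant that only stops it.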



3.5 Java Language Mapping 

An IDL interface is mapped to two Java interfaces and two Java classes: a client- 
side interface, a server-side interface, a proxy class, and a dispatcher. The client- 
side interface is the outside view of the device. The server-side interface extends 
the client-side interface, and in addition contains a method for each variant of a 
method used for distance-based dispatching. Thus, one method on the client-side 
may correspond to several methods on the server-side; the implementation uses 
a simple naming scheme to allow declaration of methods overloaded by distance. 
The proxy implements remote method calls. The dispatcher checks the safety 
restrictions and dispatches according to the distance before it calls methods on 
the remote object. In combination these objects hide the details of the network 
communication and safety verification. 

All generated methods have an origin annotation — an extra parameter for 
holding the identification of the origin. The dispatcher verifies that the origin 
device currently is in the required zone. If verification fails, the runtime exception 
SafetyException is thrown, which can be handled by the caller. Depending on 
the situation, the caller may choose to silently ignore the failure, or may choose 
to propagate the error condition all the way up to the user interface level (e.g., 
by informing the user that the operation was not permitted). When there are 
several anticipated variants of an operation, distance-based dispatching can be 
used to avoid triggering a SafetyException. 
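
The role of the dispatcher and of SafetyException can be illustrated with a hand-written stand-in for the generated code. Only the name SafetyException and the extra origin parameter are taken from the text; ZoneResolver, the use of a plain string as origin, and the class names are assumptions, and the proxy and the server-side interface with its distance variants are left out.

```java
enum Zone { PRESENT, LOCAL, GLOBAL }

/** Unchecked, so callers may handle it or let it propagate to the user interface. */
class SafetyException extends RuntimeException {
    SafetyException(String message) {
        super(message);
    }
}

/** Client-side view: every method carries the origin as an extra parameter. */
interface MotorizedFoot {
    void turn(String originId, short degrees);
    short getPosition(String originId);
}

/** Hypothetical hook through which the dispatcher learns the origin's current zone. */
interface ZoneResolver {
    Zone zoneOf(String originId);
}

/** Device-side implementation, unaware of safety checks. */
final class MotorizedFootImpl implements MotorizedFoot {
    private short position;

    public void turn(String originId, short degrees) {
        position = degrees;
    }

    public short getPosition(String originId) {
        return position;
    }
}

/** Stands in for the generated dispatcher: checks the modifier, then delegates. */
final class MotorizedFootDispatcher implements MotorizedFoot {
    private final MotorizedFoot target;
    private final ZoneResolver resolver;

    MotorizedFootDispatcher(MotorizedFoot target, ZoneResolver resolver) {
        this.target = target;
        this.resolver = resolver;
    }

    public void turn(String originId, short degrees) {
        require(Zone.PRESENT, originId, "turn"); // declared 'present' in the IDL
        target.turn(originId, degrees);
    }

    public short getPosition(String originId) {
        require(Zone.GLOBAL, originId, "getPosition"); // declared 'global' in the IDL
        return target.getPosition(originId);
    }

    private void require(Zone required, String originId, String operation) {
        Zone actual = resolver.zoneOf(originId);
        if (actual.ordinal() > required.ordinal()) {
            throw new SafetyException(operation + " requires " + required
                    + " but origin is " + actual);
        }
    }
}
```

A caller that expects to be out of range at times can catch SafetyException around turn and decide whether to ignore the failure or report it to the user.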



3.6 Assessment 

Our approach allows safety concerns to be declared at the interface design level. 
Thus, implementing distance-based safety features on a device is done simply 
by adding a modifier to an operation, which is significantly simpler than manu- 
ally implementing similar functionality. However, flexibility is reduced since the 
safety restrictions are fixed in the device. Alternatively, the safety restrictions 
could have been implemented with a policy language combined with policy en- 
forcement [7]. The enforcement would require communication with a security 
unit to grant a short-term permission to access an operation, as the access
rights change when devices move. A policy-based solution would allow various
safety levels or dynamic changes to the safety levels, and using a policy-based 
language gives a separation of concerns. On the other hand, integrating safety
concerns into the device software (as in our solution) makes safety considerations
an unavoidable part of the design process instead of allowing them to be
postponed. Moreover, the device manufacturer has better control over safety
restrictions, and expressing distance-based concerns at the language level en- 
ables the programmer to use distance-based dispatching to concisely implement 
distance-dependent functionality. 

4 Implementation 

4.1 Home Network Hardware: IR and Local Network 

We have chosen infrared (IR) and a local network as technologies to support lo- 
cation awareness in our implementation. For a detailed survey of location aware- 
ness technologies, we refer to Hightower and Borriello [6] . IR signals can be used 
to detect whether a device is present with regard to another device, since they
tend to be diffused throughout an entire room, but do not traverse walls [9]. We 
classify devices as being either stationary or mobile, which allows us to use a 
home network to determine whether a device is local to the home. If a stationary 
device is on the local network, it is in the home. Similarly, if a mobile device is 
on the local network and can contact a stationary device via IR signals, it is in 
the home. Additionally we use the concept of IR proximity groups: Two devices 
are in the same group if they are transitively within mutual IR range. 
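
The rules above can be sketched as a small graph computation. The sketch makes several assumptions: HomeNetwork and its methods are hypothetical names, IR contact is modelled as an undirected link, and a mobile device is taken to be local when its proximity group contains a stationary device on the local network.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class HomeNetwork {
    private final Map<String, Set<String>> irLinks = new HashMap<>();
    private final Set<String> stationary = new HashSet<>();
    private final Set<String> onLan = new HashSet<>();

    void addDevice(String id, boolean isStationary, boolean isOnLan) {
        irLinks.computeIfAbsent(id, k -> new HashSet<>());
        if (isStationary) stationary.add(id);
        if (isOnLan) onLan.add(id);
    }

    /** Records that a and b currently see each other's IR signals. */
    void irContact(String a, String b) {
        irLinks.computeIfAbsent(a, k -> new HashSet<>()).add(b);
        irLinks.computeIfAbsent(b, k -> new HashSet<>()).add(a);
    }

    /** IR proximity group of id: everything reachable over mutual IR links. */
    Set<String> proximityGroup(String id) {
        Set<String> seen = new HashSet<>();
        Deque<String> todo = new ArrayDeque<>();
        seen.add(id);
        todo.add(id);
        while (!todo.isEmpty()) {
            String current = todo.poll();
            for (String next : irLinks.getOrDefault(current, new HashSet<String>())) {
                if (seen.add(next)) {
                    todo.add(next);
                }
            }
        }
        return seen;
    }

    /** Two devices are present to each other if they share a proximity group. */
    boolean isPresent(String a, String b) {
        return proximityGroup(a).contains(b);
    }

    /** Stationary: on the LAN. Mobile: on the LAN and grouped with a stationary LAN device. */
    boolean isLocal(String id) {
        if (!onLan.contains(id)) return false;
        if (stationary.contains(id)) return true;
        for (String peer : proximityGroup(id)) {
            if (stationary.contains(peer) && onLan.contains(peer)) return true;
        }
        return false;
    }
}
```

A mobile device that is on the wireless network but has no IR contact with any stationary device is therefore not treated as local in this sketch.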

To protect the local network from outside intrusion, strong encryption is 
needed to ensure secure communication between the trusted devices on the home 
network. As argued in Sect. 2, security based e.g. on passwords is unsafe because 
the users can compromise the safety of the system by revealing the password. 
We assume that a secure network can be established without risk of the user 
compromising the network e.g. by revealing passwords. For example, the user 
could be required to manually carry an encryption key stored on an IR-enabled 
device between trusted devices. 

In the concrete setup stationary devices (including PCs) are connected by 
the local network. 2 Mobile devices use Wireless LAN communication to a base 
station connected to the local network. All devices are equipped with IR senders 
and receivers based on the IrDA standard. We currently assume that the mid- 
dleware in all devices can be trusted. 

Run-time verification of the local origin annotation for mobile devices is 
illustrated in Figure 4. The IR proximity group mechanism updates proximity 
information on a regular basis, so the stereo is aware that the remote control is 
within proximity range. The remote control calls a local method on the television 
using a mobile origin annotation. To verify that the remote control is local, the
television asks all devices on the local network whether one of them is in a proximity
group with the remote control. The stereo confirms the group membership, so
the operation is allowed to execute. 

2 All devices are currently simulated using stationary and portable PCs; porting the 
system to an embedded platform is future work.

Fig. 4. Run-time verification of mobile, local origin annotation



4.2 Home Network Middleware: Bisu 

As described in the introduction, we are developing a middleware for home 
networks in collaboration with B&O. This middleware, named Bisu 3 , constitutes 
the middle layer of a complete system architecture for networked AV devices 
implemented in Java. 

Bisu interfaces are declared using the IDL language described in Sect. 3.3. 
The IDL compiler (described in Sect. 3.5) generates appropriate interface, proxy, 
and dispatcher code for each component. The origin annotation designates which 
device is the origin of a call, and is passed between devices from caller to callee. 
The creation of an origin annotation is done using an API where the programmer 
also indicates whether the device is currently considered mobile (see Sect. 3.2). 
The safety restrictions require a user to be present or local. So, the genera- 
tion of origin annotations must be controlled to ensure that they always iden- 
tify interaction with a user. Therefore, the ability to act as origin is limited to 
the methods included in the origin specification which is written by the pro- 
grammer. We use a static verifier, implemented using the IBM Jikes Bytecode 
Toolkit (www.alphaworks.ibm.com/tech/jikesbt), to ensure that only meth- 
ods pointed out in the origin specification create origin annotations. 

4.3 Experimental Validation 

To experimentally validate the operation of our system, we have used a setup 
where PCs act as home AV-devices. A tablet PC acts as an advanced mobile 
remote control that interacts with two stationary PCs connected using the local 
network. We do not encrypt messages at the moment as it is not necessary for 

3 Bisu is named after the Egyptian household deity (also called Bes) believed to guard 
against evil spirits and misfortune.

the validity of the experiment. One PC acts as a television with a motorized foot,
implemented using the IDL declaration described in the example of Sect. 3.3, 
the other PC as a generic AV device. 

As expected, local calls originating in the tablet PC can be made only when 
there is IR contact between the tablet PC and one of the PCs, and present calls 
can only be made when the tablet PC or the PC acting as destination can see 
IR signals from the other, sometimes using the third PC as intermediate device. 
The effective range of the remote control is limited to roughly 2.5 m with our
hardware (although different IrDA hardware exhibits different range character- 
istics). There is no significant call-time overhead due to the safety verification, 
since all verification can be done locally based on the safety annotation which 
was passed as an extra parameter. 

5 Related Work 

Context awareness is a key topic in ubiquitous computing, as it allows comput- 
ers to take the current usage context into account [3]; it has a huge variety of 
different uses. Proximity-based login is probably what resembles our work the 
most. Here, the basic approach is that the user is automatically logged into a 
computer when physically located within a given perimeter [1,5, 8, 9]. Similarly, 
we also use context awareness to restrict the operations that can be executed on 
a computer according to the physical location of the user who initiated the oper- 
ation. However, unlike these other approaches where restrictions are essentially 
implemented manually, we reify the restrictions at the programming language 
level to be part of the interface for each device, and we allow a fine-grained 
(per-operation) approach to controlling access. 

Home network technologies such as X10, UPnP, LonWorks, and OSGi were
briefly discussed in Sect. 2. In all cases safety is enforced through security. There 
is no backup mechanism in case an intruder gains access to the home network 
or if residents are not aware of unwanted consequences of their actions. Our 
approach on the other hand is designed to prevent unsafe actions from being 
executed when the user has access to the home network, rightfully or not.

Policy languages were mentioned in Sect. 3.6 as another means to express 
distance-based restrictions. Policy languages are capable of expressing complicated
combinations of e.g. rights and prohibitions. Policy-based security that
uses access control has been implemented for a pervasive environment [7].

6 Conclusion and Future Work 

Ubiquitous computing is an emerging trend not only in computer science but also 
in everyday life. Nonetheless, programming language support for central issues
such as context awareness has not been widely explored, and little attention
has thus far been devoted to basic concerns such as safety. We have presented a 
novel approach based on an extended IDL language that helps the programmer 
to declaratively express location awareness, and we have applied this approach to
improving the safety of a home network for AV devices currently being developed
in collaboration with an industrial partner. Our current implementation, which 
is based on IR communication and a trusted local network, enforces basic safety 
concerns, in principle making it impossible for an intruder to compromise safety 
without physically entering the home. 

In terms of future work, we are primarily interested in further exploring 
programming language aspects of location awareness. Languages such as C# 
and the newest version of Java integrate metadata, which could be used to 
annotate location-based access modifiers directly onto methods without the use 
of a separate IDL (language independence would of course be sacrificed). 
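
As a rough indication of what such a metadata-based variant might look like in Java, the sketch below declares a hypothetical annotation @Restricted and reads it back by reflection. None of these names exist in Bisu, and treating unannotated methods as present is a conservative choice made only for this example.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

enum Zone { PRESENT, LOCAL, GLOBAL }

/** Hypothetical counterpart of the IDL access modifiers, kept available at run-time. */
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface Restricted {
    Zone value();
}

class AnnotatedScreen {
    @Restricted(Zone.LOCAL)
    public void powerOn() { }

    @Restricted(Zone.GLOBAL)
    public void powerOff() { }

    public void reset() { } // deliberately left unannotated
}

final class ModifierReader {
    /** Zone required by the named no-argument method; PRESENT if it is unannotated. */
    static Zone requiredZone(Class<?> type, String methodName) {
        try {
            Method m = type.getDeclaredMethod(methodName);
            Restricted r = m.getAnnotation(Restricted.class);
            return r == null ? Zone.PRESENT : r.value();
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException("no such method: " + methodName, e);
        }
    }
}
```

A dispatcher could consult requiredZone before invoking a method, which would replace the separate IDL at the cost of tying the safety declarations to Java.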

In closing, we note that there is a tension in ubiquitous computing between 
usability and security. For example, the safety mechanisms integrated in Bisu 
could be used to automate security procedures, like proximity-based login. We 
are interested in exploring to what extent incorporating safety awareness into 
applications can free the user from annoying security procedures.

Abstract. The purpose of AmbieSense is to provide personalised, context-sensitive
information to the mobile user. It is about attaching digital
information to physical objects, rooms, and areas. The aim is to provide
relevant information to the right user and situation. Digital content is distributed 
from the surroundings and onto your mobile phone. An ambient information 
environment is provided by a combination of context tag technology, a software 
platform to manage and deliver the information, and personal computing 
devices to which the information is served. This paper describes how the 
AmbieSense reference architecture has been defined and used in order to 
deliver information to the mobile citizen at the right time, place and situation. 
Information is provided via specialist content providers. The application area 
addresses the information needs of travellers and tourists. 



1 Introduction to AmbieSense 

AmbieSense addresses ambient, personalised, and context-sensitive information 
systems for mobile users. The overall goal of such systems is to help achieve the
digital, ambient environments that make users' information-related tasks easier by
adapting to users' context and personal requirements. Our approach is
illustrated in Figure 1 below. The figure illustrates the AmbieSense reference
architecture at an overall level. It can be used to build various digital information
channels for mobile users. The objective is to provide the right information for the
mobile user's current situation. The figure depicts three central cornerstones of the
system: Content Service Providers (CSP), context tags and mobile/travelling users.

Information or content is provided by Content Service Providers, offering net- 
based, digital information services to their customers. This is currently achieved by 
direct communication between the information consumers (i.e. the mobile users) and 
the CSP. A key objective of CSPs is to increase the value of their services by 
increasing the reach, relevance, and accuracy of information provided to the 
consumer.

Fig. 1. The AmbieSense overall reference architecture




AmbieSense has designed and implemented context tags (see Figure 2) as part of 
the project. Context tags are miniaturized, wireless computers placed at strategic 
points in the user's environment that can relay content from the CSP, prioritised with 
context information, to the mobile device. The context tags have computing 
capabilities which enable different software applications to run on them. Context tags 
differ from other disappearing computers because one can exploit contextual
information about both the tag environment and the user in range to provide relevant
content. We have designed a product family of context tags to fulfil different
application needs within ambient computing. The tags can differ with respect to
storage capacity, network communication (i.e. Bluetooth, WLAN, and Ethernet),
computing speed, and programming possibilities. They are based on the Linux
operating system, and can have none or several of the AmbieSense software
components running on them, depending on the application needs (see Figures 3 and 4).

Fig. 2. The AmbieSense context tags. They communicate via Bluetooth to handheld
devices nearby, and are 12 cm in diameter. (Hardware and design by SINTEF ICT)

Mobile users are the consumers of information services provided by the system.
We assume that they use some kind of mobile computer, e.g. a PDA or a smart phone,
to interact with these services. A central idea is to associate information with objects
in the surroundings. This can be through seamless access to content when people are 
near tags or by creating an environment that stimulates the user’s curiosity and 
encourages him to look for information in the surroundings. AmbieSense applications 
can identify content for an individual by exploiting contextual knowledge about the 
user. 

In summary, AmbieSense seeks to address the requirements of ambient content 
services provision and mobile users by improving the reach, accuracy and timeliness 
of content delivered to the mobile user. Each application can have different system 
architectures instantiated from the AmbieSense reference architecture. The next 
sections explain how this is achieved. 



1.1 Infrastructure and Framework 

The AmbieSense framework can run on different infrastructures that enable mobile 
users to access digital content. 

Infrastructure. AmbieSense can run on a range of infrastructure technologies, 
including wireless communications, hand-held computers, PDAs, smart phones, 
information and application servers. Each technology has been targeted because it 
can provide important, value-adding functionality to the framework in terms of 
integration with both new and existing systems. 

AmbieSense Framework. Among the main technical outcomes of the project are the 
context tags (hardware) together with the technical platform (software). The tags and 
the technical platform are both meant to be key mechanisms for building different 
ambient content services and applications. The main architectural components of the 
AmbieSense Framework are depicted in Figures 3 and 4. For instance, the underlying 
Content Integration Platform (CIP) provides the ambient content management 
functionality to an application. It enables ambient access to content from mobile 
devices with limited storage. One important part of the CIP is a search engine that can 
run on mobile phones, context tags, and content servers. Another component is the 
context middleware, which supports context-aware applications. It enables 
applications to store, update, and retrieve contexts and their structural templates (in 
the project, we have used our own context structures, but the middleware supports the 
development and use of others too). 



2 The AmbieSense Reference Architecture 

The AmbieSense reference architecture is documented using principles from the 
RM-ODP (Reference Model of Open Distributed Processing) method [1]. This method was 
selected mainly because it is an established standard. It comes with five predefined 
viewpoints: enterprise, information, computational, engineering, and technological 
viewpoints. Each viewpoint helps to visualise the range of different aspects that need 
to be addressed when constructing architectures. 

In this paper, we describe the following three viewpoints¹: 

1. Enterprise viewpoint - focusing on the overall functionality of the system 
towards its users and stakeholders, described in terms of user roles and 
actors. 

2. Computational viewpoint - focusing on the high-level component 
descriptions and interactions between these components. The computational 
viewpoint is documented using diagrams showing the functional 
decomposition into components and their interfaces. 

3. Engineering viewpoint - focusing on how the concrete configurations of the 
reference architecture can be deployed to real systems. The engineering 
viewpoint is documented using deployment diagrams showing 
communication and distribution aspects of system components. It also 
addresses performance and capacity issues by suggesting how these quality 
requirements can be satisfied by the different configurations. 



2.1 Enterprise Viewpoint 

Content: Content is the information that flows in the AmbieSense system. From the 
user's point of view, content is any data that the user can receive, either manually by 
requesting it, or automatically according to some settings. Usually, content is 
temporarily stored in the form of audio, movie, image, text, or mark-up files before it 
is presented to the user. 

Context: Context can be defined as a description of the aspects of a situation [2]. The 
period of a context can range from a very short moment to many years. The 
current context can depend on several criteria, e.g. the location, mental state, etc. The 
user’s current context can have a direct influence on the functionality of the user 
application (e.g. the application can present relevant information to the user or choose 
not to present some information based on the current context). One may speak of 
several types of contexts, depending on the application, its information needs, and 
one's view of which data are important to capture and describe a situation. 

In AmbieSense, context technology is a mechanism that can capture context 
structures, and links between contexts and content. However, we argue that there 
should be some common structure for user contexts that is easy to reuse across 
domains (such as different applications). What makes domains differ is mainly that 
the relevance and importance of attributes within the context structure differ. 
Redundant attributes may exist in the context as their relevance can change over time. 
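
The idea of one shared context structure whose attributes carry different relevance in different domains can be illustrated with a minimal sketch. The attribute names, the weight values, and the agreement-based similarity measure below are assumptions chosen for illustration; they are not the AmbieSense context structures or its matching method.

```python
def weighted_similarity(a, b, weights):
    """Share of the total attribute weight on which contexts a and b agree.

    `weights` holds the domain-specific relevance of each attribute.
    An attribute with weight 0 stays in the structure but is ignored.
    """
    total = sum(weights.values())
    if total == 0:
        return 0.0
    agreed = sum(
        weight
        for name, weight in weights.items()
        if name in a and name in b and a[name] == b[name]
    )
    return agreed / total


# Hypothetical domains that reuse the same structure with different weights.
AIRPORT_WEIGHTS = {"location": 3.0, "flight": 2.0, "interests": 1.0}
TOURISM_WEIGHTS = {"location": 1.0, "flight": 0.0, "interests": 3.0}

current = {"location": "gate area", "flight": "F1", "interests": "coffee"}
stored = {"location": "gate area", "flight": "F1", "interests": "books"}

airport_score = weighted_similarity(current, stored, AIRPORT_WEIGHTS)
tourism_score = weighted_similarity(current, stored, TOURISM_WEIGHTS)
```

The same pair of contexts is a close match in the airport domain and a poor match in the tourism domain, although neither the structure nor the stored values change.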

Users: The users are the consumers of content that a CSP provides. They may also 
indirectly be consumers of contexts. A user will be able to receive information 
according to the current context. 



¹ Note that the technological viewpoint is not addressed by the AmbieSense reference 
architecture, because it is platform and implementation neutral. Technological aspects should 
be addressed at the time of derivation of a specific system, when detailed technological 
requirements become available. Also note that the information viewpoint exists in technical 
project reports and is too large to be covered within the scope of this paper. 

Context Tags: AmbieSense has developed new, miniaturized, wireless computers 
called context tags. A context tag is an entity that enables the binding of a location to 
a context (or a set of contexts), and content. The context tag is realised as an 
embedded computer with a Bluetooth interface for communication. The 
communication facility is used for the exchange of software, content and context. 

In its simplest form, a context tag only gives a reference to a location to be used by 
a mobile user. More advanced context tags enable other services. For instance, a 
context tag can have a web server, a search engine, and other software installed, 
available for mobile devices nearby to use. 

2.2 Computational Viewpoint 

The computational viewpoint is concerned with the functional decomposition of the 
system into components and the interaction patterns between the components 
(services) of the system, described through their interfaces. 

Overview. The AmbieSense Reference Architecture consists of a set of main 
components, described below following a top-down sequence. Figure 3 illustrates the 
layers and the organisation of the architecture pattern. The light grey components are 
part of the AmbieSense Framework; the darker grey components are part of each 
application/solution developed on top of the framework. 

By layering, we assume that components form a stack that prescribes how 
components interact with each other. For example, in Fig. 3 the user interface is 
illustrated on top of applications and agents. This implies that the user interface uses 
the services offered by applications and agents. Likewise, applications and agents use 
the services from the push and pull components. In some system architectures, one or 
more of the indicated layers may not be present, thus the reference architecture allows 
interaction between non-adjacent layers. However, if the layer is present, then the 
layering principle should be adhered to. 
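
One simple reading of the layering principle can be sketched as follows: a component uses the nearest layer below it that is present in the concrete system architecture, so a layer is only skipped when it is absent. The layer names and their order are taken from the top-down description in this section; the function itself is an illustrative assumption, not part of the AmbieSense framework.

```python
# Layers from top to bottom, as named in the text (identifiers are our own).
LAYERS = [
    "user_interface",
    "applications_agents",
    "push_pull",
    "context_middleware",
    "cip",
]


def allowed_provider(caller, present):
    """Return the nearest present layer below `caller`, or None if there is none.

    `present` is the set of layers included in a concrete system architecture.
    """
    below = LAYERS[LAYERS.index(caller) + 1:]
    for layer in below:
        if layer in present:
            return layer
    return None


full_stack = set(LAYERS)
without_applications = {"user_interface", "push_pull", "cip"}

provider_full = allowed_provider("user_interface", full_stack)
provider_reduced = allowed_provider("user_interface", without_applications)
```

In the full stack the user interface is served by applications and agents; in the reduced configuration it may talk to the push and pull components directly, because the intermediate layer does not exist there.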

• User Interface. The user interface enables the user’s interaction with the 
system. The design of the user interface is based upon the requirements for 
the particular application. The user interface uses the services offered by the 
developed application and/or agents. 



[Figure 3: layered stack with the labels, top to bottom: User interface; Applications, Agents; Push, Pull; Context Middleware; CIP] 

Fig. 3. Functional decomposition of AmbieSense architecture 

• Applications and Agents. Applications and agents² are developed according 
to the needs of each specific solution. Developers may choose whether to 
exploit agents to provide a solution. Applications (and agents) use the 
services of the layers below, primarily the push and pull services, but also 
services from the context middleware and CIP when appropriate. 

• Push and Pull. The Push and Pull components implement functionality for 
content distribution, supporting the two different principles push and pull. 
The Push and Pull components use the context management functionality of 
the context middleware together with the content provisioning capabilities of 
the CIP to enable context-aware content distribution. 

• Context Middleware. The context middleware is responsible for context 
management (context-storage, -retrieval, and -matching functions). The 
context middleware supports additional functionality such as linking content 
to contexts. 

• CIP. The CIP (Content Integration Platform) is a composite service 
component that deals with content management in terms of capturing, 
inclusion, integration, and distribution of content to users. It adds 
functionality for personalisation and customisation - all within an integrated 
platform. The CIP provides a single interface to heterogeneous content bases 
underneath while adding useful functionality for caching, and application 
protocols in an integrated environment. CIP Light is a minimized version of 
the CIP designed for mobile devices. A mobile device does not need the 
same functionality as does a CSP server, and the distinction between CIP and 
CIP Light reflects this. 

• Network Services. Because AmbieSense is inherently distributed, networking 
capabilities are used between the platforms (i.e. mobile computer, context 
tag, and CSP platform). Networking capabilities are provided by industry 
standard technologies, such as Bluetooth, WLAN, and GPRS. 

• Proximity Detection. In addition, a subcomponent called proximity detection 
will enable detection of mobile computers in the vicinity of the context tag. 
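
The context middleware functions named above (storing, updating, retrieving and matching contexts, and linking content to contexts) can be sketched in a few lines. This is a minimal in-memory illustration under our own assumptions: the class name, method names, and the overlap-count matching rule are hypothetical and do not reproduce the AmbieSense middleware API.

```python
class ContextMiddleware:
    """In-memory stand-in for context storage, retrieval, matching and linking."""

    def __init__(self):
        self._contexts = {}  # context id -> attribute dict
        self._links = {}     # context id -> list of content references

    def store(self, context_id, attributes):
        self._contexts[context_id] = dict(attributes)

    def update(self, context_id, **changes):
        self._contexts[context_id].update(changes)

    def retrieve(self, context_id):
        return dict(self._contexts[context_id])

    def link_content(self, context_id, content_ref):
        self._links.setdefault(context_id, []).append(content_ref)

    def match(self, current):
        """Return the id of the stored context sharing most values, or None."""
        best_id, best_score = None, 0
        for context_id, stored in self._contexts.items():
            score = sum(1 for k, v in current.items() if stored.get(k) == v)
            if score > best_score:
                best_id, best_score = context_id, score
        return best_id

    def content_for(self, current):
        """Content linked to the best-matching context (what a pull would return)."""
        return list(self._links.get(self.match(current), []))


mw = ContextMiddleware()
mw.store("departure_hall", {"location": "departure hall", "activity": "waiting"})
mw.store("arrivals", {"location": "arrivals", "activity": "leaving"})
mw.link_content("departure_hall", "content/shops.html")
mw.link_content("arrivals", "content/transport.html")
mw.update("arrivals", activity="collecting luggage")

pulled = mw.content_for({"location": "arrivals", "activity": "collecting luggage"})
```

A pull component built on such an interface would pass the current user context to `content_for` and hand the returned references to the CIP for delivery; a push component would run the same lookup when the context changes.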



2.3 Engineering Viewpoint 

The engineering viewpoint is concerned with the design of distribution-oriented 
aspects, i.e. how components are physically deployed to the different machines and 
infrastructure required to support distribution. The engineering viewpoint specifies a 
networked computing infrastructure that supports the system structure defined in the 
computational specification and provides the distribution transparencies that it 
defines. 

It is important to note that although we discuss context tags in all configurations, 
it is possible to implement the AmbieSense Reference Architecture without them. 
They might be omitted, or other kinds of computers with similar capabilities can be 
used instead. For example, wireless access points and other embedded computers can 
be used instead of the tags. However, they should be re-programmable by the 
application developer. Hence, a context tag is an open hardware platform. 

² In multi-agent systems, agents are programs that act in a self-interested manner in their 
dealings with numerous other agents inside a computer. This arrangement can mimic almost 
any interactive system: a stock market; a habitat; even a business supply-chain. 

Fig. 4. Reference Architecture Platform 

Figure 4 illustrates how the various functional components are organised logically 
in the reference architecture. It does not show a concrete implementation of it. Rather, 
it shows how each of the three platforms (mobile computer, context tag and content 
service provision platform) is able to function as a nearly independent execution 
environment. This is, however, not a very likely scenario in practice. It is merely 
a demonstration of the flexibility of the reference architecture to support widely 
different implementations dictated by the application and user requirements. To 
illustrate this, three examples of concrete system architectures based upon the 
reference architecture are given below (see Figure 5 for deployment diagrams of the 
functional components). 



Example 1: Thin client, rich Context Tag configuration. Certain solutions 
require low-complexity mobile computer configurations. For example, if a solution 
is to support smart phones with very limited storage and processing capacity, 
most of the processing should be allocated to the Context Tag or the 
Content Service Provision platform (see Example 1 in Figure 5). 

Example 2: Medium-weight Mobile Computer, rich Context Tag configuration. In 
scenarios where more generic and sophisticated context processing such as context 
matching is required, the Context Middleware can also be deployed to the Mobile 
Computer. Naturally, this assumes more processing power and storage available on 
the mobile device. Example 2 in Figure 5 illustrates the deployment scenario. In this 
example the application can pull content from the context tag. The pull can be based 
upon the current context on the mobile computer, and the content related to it, or to 
other similar contexts, may be delivered from the tag to the user. 

Example 3: Rich Mobile Computer, medium-weight Context Tag, medium-weight 
Content Service Provision platform. The previous configurations have not 
included the Content Service Provision platform. For many technical solutions, the 
integration of content from existing content management systems is critical. This may 
be a good reason to consider configurations that include a content server that can 
provide the link to existing legacy systems. The CIP component addresses the task of 
integration and a vast number of other sophisticated processes. 
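
The three example configurations can be thought of as different assignments of functional components to platforms. The sketch below records one such assignment per example as plain data. The exact placement of components is an assumption made for illustration, since the deployment diagrams themselves are not reproduced here; only the broad shape (thin versus rich client, context middleware on the mobile computer in Example 2, a CSP platform only in Example 3) follows the text.

```python
# Hypothetical component placements; the component lists are illustrative only.
DEPLOYMENTS = {
    "example_1": {
        "mobile_computer": ["user_interface"],
        "context_tag": ["application", "push_pull", "context_middleware", "cip"],
    },
    "example_2": {
        "mobile_computer": ["user_interface", "application", "push_pull",
                            "context_middleware"],
        "context_tag": ["push_pull", "context_middleware", "cip"],
    },
    "example_3": {
        "mobile_computer": ["user_interface", "application", "agents",
                            "push_pull", "context_middleware", "cip_light"],
        "context_tag": ["context_middleware", "proximity_detection"],
        "csp": ["cip"],
    },
}


def platforms_hosting(component, deployment):
    """Return the sorted names of the platforms that host `component`."""
    return sorted(
        platform
        for platform, components in deployment.items()
        if component in components
    )


middleware_hosts = platforms_hosting("context_middleware", DEPLOYMENTS["example_2"])
cip_hosts = platforms_hosting("cip", DEPLOYMENTS["example_3"])
```

Writing a configuration down in this form makes it easy to check simple properties of a candidate architecture, for example that every component an application depends on is hosted on at least one reachable platform.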

The project has implemented several system architectures from the reference 
architecture. One example application is the agent-based system that was developed 
and tested for Oslo International Airport. The system architecture is exactly the same 
as depicted in Example 3 in Figure 5, with Oslo airport's own content management 
system, WebCentral 2000, incorporated on the CSP side. Some screenshots of the 
Oslo airport application are shown in Figure 6. 

[Figure 5: deployment diagrams. Recoverable panel titles: "Example 2: Medium-weight Mobile Computer, rich Context Tag configuration" and "Example 3: Rich Mobile Computer, medium weight Context Tag, medium weight Content Service Provision platform". Legend: application specific component; AmbieSense framework.] 

Fig. 5. Example system architectures instantiated from reference architecture 

Even with such a rich client on the PDA, the client application responds with 
immediate recommendations to the user once the user context (of which preferences 
are one part) is changed. The agents provide personal recommendations to the user 
based upon the user contexts (i.e. preferences, flight, and location within the airport). 
The content offered is the same as all travellers get at the airport, including special 
offers, shopping, flights, and transportation. The user is notified of changes in 
recommendations by a blinking tip button. This happens either when the user changes 
the personal preferences (i.e. user context), or when the user is near a context tag 
(i.e. user context is enriched by the context tag). 




Fig. 6. The Oslo airport application based on agents. A) My flight info via WLAN, B) User 
preferences/ context, C) Agent recommendations based on the current user context. 



3 Applications and Agents 

Applications in the AmbieSense Reference Architecture are developed to implement 
the business logic required to serve the needs of a technical solution. This application 
logic acts as a mediator between communication equipment, storage, middleware 
components and the user interface components. 

Applications and agents in AmbieSense can be compared with the Controller in the 
Model-View-Controller paradigm. Hence, they embed the application logic of the 
system. They can reside on the context tags, the mobile computers, or as mediators 
towards the CIP located in the network. Agents can be said to accomplish a range of 
tasks for the user. They can be defined as self-contained, autonomous pieces of 
software. An AmbieSense application can be implemented using agents. 

3.1 The AmbieSense Multi-Agent System and Agent Types 

Within AmbieSense, part of the motivation for using agent systems is based upon 
non-functional requirements: the complexity of the interaction of the system's users 
with their environment, including the sheer diversity of usage domains and contexts, 
requires a modular, component-based approach. The requirement for posing 
context-sensitive requests to a distributed system implied the use of a multi-agent 
system architecture. Additionally, the non-functional requirements specified for the 
AmbieSense system, including the requirements for scalability and extensibility into 
new domains with new users and new contexts, made obvious the need for a MAS-like 
solution. 

The AmbieSense Multi-Agent System (A-MAS) uses JADE/LEAP³, an existing 
multi-agent framework, for the protocol and communication. The agents implemented 
within the JADE/LEAP framework provide the generic core functionality for 
handling user contexts and triggering content queries. The JADE/LEAP platform was 
chosen as a result of a survey of state-of-the-art systems and an evaluation process of 
agent programming. On top of the JADE/LEAP framework, the A-MAS integrates 
with intelligent components for context-based content retrieval. A method that uses 
case-based reasoning for context recognition, together with a semantic web approach 
to content classification, enables the retrieval of content relevant to the situation. 

The AmbieSense Multi-Agent System combines agent, context and content 
technology. There are four agent categories, which are derived from the 
JADE/LEAP framework and thus benefit from that infrastructure⁴: 

• Content Agents provide low-level content dependent functionality by 
interfacing directly with CIP and the underlying content providers. 

• Context Agents are the principal agents of the A-MAS. They interact with the 
context middleware and administer the access to the user’s context space 
while ensuring privacy and security. The context agent updates and 
maintains the user context and triggers the queries for content conducted by 
the content agents. The context agent will forward this context to content 
agents and integration agents that will convert the context to a proprietary 
query fit for the respective IR and content modules. The context agent keeps 
track of the available search types/modules and will always forward the 
context to each of these modules. 

• Recommender Agents use contextual information and employ reasoning 
techniques for an analysis of the users’ situation in order to provide 
appropriate content. 

• Integration Agents are a kind of wrapper that provides interaction 
capabilities between the A-MAS and external (non-AmbieSense) 
components. 
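
The way the context agent forwards the user context to every registered search module, each of which converts it into its own query format, can be sketched as below. This is plain Python written for illustration, not JADE/LEAP code; the class names, the `register` and `update_context` methods, and the two query builders are hypothetical.

```python
class SearchModuleAgent:
    """Stands in for a content or integration agent wrapping one search module."""

    def __init__(self, name, build_query):
        self.name = name
        self._build_query = build_query  # converts a context into a module query

    def handle(self, context):
        return self._build_query(context)


class ContextAgent:
    """Maintains the user context and forwards it to every registered module."""

    def __init__(self):
        self._context = {}
        self._modules = []

    def register(self, module):
        self._modules.append(module)

    def update_context(self, **changes):
        self._context.update(changes)
        snapshot = dict(self._context)
        # Each module receives its own copy and returns its proprietary query.
        return {m.name: m.handle(dict(snapshot)) for m in self._modules}


agent = ContextAgent()
agent.register(SearchModuleAgent(
    "keyword_search",
    lambda c: " ".join(sorted(str(v) for v in c.values())),
))
agent.register(SearchModuleAgent(
    "location_search",
    lambda c: {"near": c.get("location")},
))

queries = agent.update_context(location="gate 12", interest="coffee")
```

Keeping the conversion inside each module means that a new search type can be added by registering one more agent, without changing the context agent.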

The four A-MAS agent categories in turn use services from other components, 
such as the push/pull, CIP, and context middleware. Additionally, agent-internal 
knowledge representation such as ontologies and contexts may be used by the 
recommender agent to enhance the relevance of retrieved content. The engineering 
viewpoint illustrates how these components may be deployed to the different 
platforms. The A-MAS agents, however, reside only on the mobile computer or on 
the context tag in order to ensure quick and secure processing of the user’s context. 

³ [Bellifemine 2000], http://jade.cselt.it 

⁴ JADE/LEAP is compliant with FIPA, the Foundation for Intelligent Physical Agents 
(http://www.fipa.org) 



4 Related Work 

Related work can be found in many areas, but we will only focus on work related to 
the area of ambient and context-aware computing, because this is the main motivation 
for our work. 

Recently there has been much discussion about the meaning and definition of 
context and context-awareness. This is exemplified by three recent 
workshops: DARPA [3], UM2001 [4], and SIGIR [5]. Context information may in 
general be exploited by any information service in order to improve it. Three 
important aspects of context are, for instance, where you are, whom you are with, 
and what resources are near you. This information will often change for the mobile 
user. 

According to the definition given in [6], “A system is context-aware if it uses 
context to provide relevant information and/or services to the user, where relevancy 
depends on the user’s task”. The concept of context is not yet well understood or 
defined, and there exists no commonly accepted system that supports the acquisition, 
manipulation and exploitation of context information. 

The importance of context has also more recently been discussed for information 
retrieval systems. Contextual information provides an important basis for identifying 
and understanding users' information needs. Cool and Spink, in a special issue on 
Context in Information Retrieval [7], provide an overview of the different levels at 
which context is of interest for information retrieval. Within the information 
retrieval field, related previous work [8] argued that a user's information needs all 
arise within a particular context and that context information can in general be used 
to improve information systems for users. 

Related work can also be found in the fields of ubiquitous and context-aware 
computing [9]. Dey et al. [10], in a special issue on Situated Interaction and 
Context-aware Computing, provide an overview. The focus from this perspective, 
however, has tended to be on location-based approaches and device contexts. 
Examples of these can also be found in a few applications for tourists. 

Currently, no standard method for developing context-aware systems exists. The 
approach taken in AmbieSense is to unify research and work on user modelling with 
that of context-aware applications. We believe this approach is fruitful in order to 
create context-sensitive information systems for a large set of diverse user groups in 
the future. Most other approaches have used context either as means to adapt 
software, devices, and network communication, or to analyse linguistic aspects of 
human input to the information system. 

5 Conclusions 

The use of user context in ambient computing is needed for several reasons: users are 
increasingly mobile and require ambient computing with context-aware applications; 
and they need personalised information services to help them in their tasks and needs. 

The challenge which ambient computing applications will face is complex. It 
cannot be solved easily with the current isolated approaches to wireless technology, 
miniaturised devices, context-aware applications, information retrieval, or user 
modelling. Instead, an integrated approach is needed where user focus is combined 
with effective information management in order to achieve ambient intelligence. 

The AmbieSense project has specified a reference architecture for ambient, 
context-aware information environments. It has been implemented in several system 
architectures - one of these was briefly presented in this paper. Through these 
applications, we have tried out variations of the architecture. The clients on the 
mobile devices have varied from thin clients to rich clients. The context tags have 
been used with Bluetooth, WLAN, and GPRS. The software and content deployed on 
the tags has also varied from system to system. This is also the case for the CSP. 

User tests have been conducted in both indoor and outdoor environments. The 
most recent test involved 75 test users at Oslo Airport during summer 2004. In 
general, context-aware information delivery is well accepted by the test users. Further 
work and analysis is being carried out on the test results. 

Abstract. Ambient computing as an ideal demands levels of functional attainment 
that have thus far not been realised. Ambient applications require that the 
computing application be subsumed into the everyday context in an unobtrusive 
manner with interaction modalities that are natural, simple and appropriate 
to both the individual user and their associated context. Within this paper, we 
consider the use of mobile intentional agents as potential key enablers in the 
delivery of ambient intelligent services. In particular, we compare and contrast 
two agent-based ambient intelligence case studies. 



1 Introduction 

Fundamental to realizing the ambient computing vision is the opportunistic and 
timely delivery of appropriate services to individual users or groups. Designing and 
developing system components so as to ensure effective but unobtrusive system usage 
is essential, particularly when interaction modalities are considered. How best to 
achieve this remains an open question. One approach that we are actively 
investigating concerns the use of mobile intentional agents as the key enablers to 
realizing ambient computing applications. In particular, we commission an intentional 
mobile agent-based approach. Specifically, we adopt a Belief Desire Intention (BDI) 
agent model. In the delivery of such agents we utilise the Agent Factory system, which 
supports the rapid fabrication of agent-based applications. 

Within this paper we consider two ambient intelligence case studies, namely 
EasiShop and Gulliver’s Genie. Both systems have adopted an agent-based design 
metaphor and, specifically, both have been realised through the Agent Factory 
development environment. We compare and contrast these case studies, identifying 
some of the key issues that arise in the deployment of ambient computing applications. 



2 Agent Factory 

Agent Factory [1],[2],[3] is a cohesive framework that supports a structured approach 
to the development and deployment of agent-oriented applications. Specifically, 
Agent Factory supports the creation of a type of software agent that is: autonomous, 
situated, socially able, intentional, rational, and mobile. Practically, this has been 
achieved through the design and implementation of an agent programming language 
and associated interpreter. The language and interpreter in tandem facilitate the 
expression of the current behaviour of each agent through the notions of belief and 
commitment. When augmented with a set of commitment rules, the agents’ behaviour 
is defined by the conditions under which commitments should be adopted. This 
approach is consistent with the well-documented BDI-agent model [4]. 

P. Markopoulos et al. (Eds.): EUSAI 2004, LNCS 3295, pp. 339-350, 2004. 
© Springer-Verlag Berlin Heidelberg 2004 

G.M.P. O'Hare, S.F. Keegan, and M.J. O'Grady 

The framework itself comprises four layers that deliver: the agent programming 
language, a distributed run-time environment that delivers support for the 
deployment of agent-oriented applications, an integrated toolkit that delivers a visually 
intuitive set of tools, and a software engineering methodology that specifies the 
sequence of steps required to develop and deploy agent-oriented applications with the 
preceding layers. Furthermore, Agent Factory adheres to FIPA standards through an 
Agent Management System (AMS) agent and a Directory Facilitator (DF) agent. 
Agent-oriented applications built using Agent Factory use these prefabricated agents 
to gain access to the infrastructure services provided by the run-time environment, for 
example, yellow and white pages services, migration services and so on. 
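
The notions of belief and commitment rule described in this section can be illustrated with a small sketch: a rule names the beliefs that must hold and the commitment adopted when they do. The code is ordinary Python written for illustration and does not reproduce the syntax or interpreter of the Agent Factory agent programming language; the belief and commitment names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CommitmentRule:
    condition: frozenset  # beliefs that must all be held
    commitment: str       # commitment adopted when they are


def adopt_commitments(beliefs, rules):
    """Return the commitments of all rules whose condition is satisfied."""
    held = frozenset(beliefs)
    return [rule.commitment for rule in rules if rule.condition <= held]


RULES = [
    CommitmentRule(frozenset({"user_nearby", "content_available"}), "present_content"),
    CommitmentRule(frozenset({"user_nearby", "battery_low"}), "reduce_activity"),
]

commitments = adopt_commitments({"user_nearby", "content_available"}, RULES)
```

An interpreter in this style would re-evaluate the rules whenever the belief set changes and then act on the adopted commitments.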



3 EasiShop 

The EasiShop [5], [6], [7] project encapsulates efforts that have been made to deliver a 
practical and efficient mobile shopping system. To support this vision, research has 
been undertaken into the synthesis of wired and wireless infrastructure with smart 
portable devices and agent based user interfaces to enhance the shopping experience. 
Of particular focus has been the desire to develop the appropriate components to 
enable a scalable mobile multi-agent trading platform. The user interface of EasiShop 
is shown in Fig. 1. 

The system is primarily targeted towards providing a convenient retail solution to 
the shopper and, to a lesser extent, enabling a new or enhanced revenue stream for 
retailers. Other efforts have endeavoured to address this application domain. Some, 
like Bargainfinder [8], are web-based auctioneering facilities whilst others, like 
MyGrocer [9], offer a similar type of mobile solution to that of EasiShop. A perspective 
on some of the issues inherent in this domain may be acquired upon consultation of 
the literature. 



3.1 Services 

EasiShop is primarily charged with the task of enabling cross-merchant comparison 
shopping. A user profile is maintained within the system. From this profile, a retailer 
may determine the extent of relevance that a particular user holds. When combined 
with the aspect of location determination, a powerful retailing channel can be 
realised. Retailers can automatically choose to focus efforts on users of interest. 

Fig. 1. EasiShop Graphical User Interface 



3.2 Pertinent Characteristics 

There are numerous characteristics which distinguish EasiShop from similar efforts. 
The first is that an approach based on intelligent agents is enlisted. More specifically, 
the system incorporates agents that conform to the notion of agency as 
formalised by Wooldridge & Jennings [10], and, in this case, such agents adopt a 
reasoning strategy based on the Belief-Desire-Intention (BDI) paradigm. In this 
regard, the system is similar to Gulliver’s Genie. A second feature of particular 
distinction is the employment of a reverse-auction model to procure commercial 
transactions. In this model, an agent which embodies the product requirements of the 
user migrates to a centralised server where competing provider agents representing 
different retailers are invited to vie, using a predetermined auction protocol, for the 
custom of that user. The third distinguishing aspect of EasiShop is the utilisation of 
user profile data which, together with a log of previous transactions, can be used to 
deliver appropriate product offerings to the mobile user. Finally, a product ontology is 
fundamental to the operation of EasiShop. The product ontology used is the 
UNSPSC product ontology, together with product descriptions provided via XML. 
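
The reverse-auction model can be illustrated with a minimal sketch in which stores bid for each item on a shopping list and the lowest price wins. The auction protocol used by EasiShop is not specified in this section, so the single-round, lowest-price rule below is an assumption, as are the store names, items and prices.

```python
from dataclasses import dataclass


@dataclass
class Bid:
    store: str
    item: str
    price: float


def run_reverse_auction(shopping_list, offers):
    """Pick, for each requested item, the lowest-priced bid among the stores.

    `offers` maps a store name to a dict of item -> price. Items that no
    store offers are left out of the result.
    """
    winners = {}
    for item in shopping_list:
        bids = [
            Bid(store, item, prices[item])
            for store, prices in offers.items()
            if item in prices
        ]
        if bids:
            winners[item] = min(bids, key=lambda bid: bid.price)
    return winners


offers = {
    "store_a": {"milk": 1.20, "bread": 2.00},
    "store_b": {"milk": 1.05},
}
winners = run_reverse_auction(["milk", "bread", "tea"], offers)
```

In the agent setting, the shopping list would be carried by the migrating agent as part of its belief set, and each dictionary of offers would be supplied by a competing provider agent.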
3.3 Architecture 



The complete EasiShop architecture comprises a suite of agents residing on 
three distinct nodes - the PDA, the Store Server and the Marketplace. All these agents 
collaborate to provide the required functionality (Fig. 2). Each agent is now briefly 
described. 




Fig. 2. Architecture of EasiShop 

GUI Agent: The tasks of the GUI Agent are twofold. First and foremost, it is 
concerned with controlling the onscreen components necessary to enable the user to 
create and maintain shopping lists and a user profile. Secondly, the GUI Agent 
monitors the user’s behaviour and maintains records of this activity. From these 
records, the system attempts to more accurately predict which products will interest 
the user. 

Shopper Agent: Whilst ordinarily housed on the PDA, the Shopper Agent may 
decide to migrate to the Marketplace (via a conduit store - the active EasiShop Hotspot) 
to instigate the product auction process. The belief set of the Shopper Agent 
represents a data set, which is essentially the user’s shopping list. At the Marketplace 
the Shopper Agent is paired with interested Store Agents to procure the item(s) on the 
shopping list. 

Catalogue Agent: The main duty of this agent is to ensure that all product catalogue 
information (price, new products, etc.) is kept up to date in the product list of the 
PDA. To accomplish this, it liaises with the Storefront Agent in the active EasiShop 
Hotspot, whenever one is in range. This means that the user, when viewing and se- 
lecting products via the GUI Agent, is accessing the up-to-date product dataset. 
Storefront Agent: When necessary, a certain proportion of communication is used to 
deliver product updates to the client (PDA) dataset. As previously mentioned, the 
Catalogue Agent interacts with the Storefront Agent to support this functionality. 

Store Manager Agent: This agent monitors Shopper Agents which have migrated to 
the Store. From here, the Store Manager Agent contacts the Marketplace Manager to 
request further migration to the centralised Marketplace. Should migration ensue, the 
two agents (Store Agent and Shopper Agent) are transferred to the Marketplace 
where the auction will take place. 

Store Agent: As previously outlined, the belief set of the Shopper Agent incorporates 
the shopping list of the user. Similarly, the belief set of the Store Agent is representa- 
tive of the real world attributes of the store. This may include types of products on 
offer as well as stock and pricing information. The Store Agent is spawned and trans- 
ferred by the Store Manager to the Marketplace to enter an auction as required. 
Marketplace Manager Agent: Coordination of the centralised Marketplace is per- 
haps the most important factor in enabling the arena in which agents may buy and sell 
products. The Marketplace Manager Agent organises and administers migration to the 
marketplace as well as allocating system resources required in the auctioneering proc- 
ess. In practical terms, this means that a public market list, containing information on 
exactly what products are sought by whom, is maintained. This list is monitored by 
both the Stall Manager agents, which request to administer individual auctions and by 
the Store Agents who may request entrance to a particular auction, should an auction 
be of interest. 

Stall Manager Agent: This agent is charged with the task of coordinating the auction 
process. After acquiring an auction from the market list, the auction is deemed to be 
open. A deadline for interested parties as well as opening and closing time values are 
set. At this point, the type of auction is also made known. This can be any one of a 
number of auction protocols - e.g. Dutch or Vickrey [11]. It is important to note at 
this point that the Stall Manager is what is termed an environmental agent. This 
means that the number of Stall Manager agents in the system is not fixed and may be 
adjusted by the Marketplace Manager agent as required. More Stall Manager Agents 
might be required at busy times, for example. The bid, accept and reject messaging of 
participating auctions is coordinated before the auction closes, at which point a win- 
ner is declared. 
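
To illustrate how a Stall Manager might resolve one of the sealed-bid protocols mentioned above, the following Python sketch runs a reverse Vickrey (second-price) auction: the Store Agent offering the lowest price wins, but is paid the second-lowest price. The names `Bid` and `run_reverse_vickrey`, and the tie-breaking rule, are assumptions made for this example and are not taken from the EasiShop implementation.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Bid:
    store: str    # identifier of the bidding Store Agent (hypothetical)
    price: float  # price at which the store offers the requested item


def run_reverse_vickrey(bids: List[Bid]) -> Optional[Tuple[str, float]]:
    """Return (winning store, price paid), or None if nobody bid.

    In a reverse auction the sellers compete, so the lowest bid wins.
    Under the Vickrey rule the winner is paid the second-lowest bid;
    with a single bidder the winner is simply paid its own bid.
    """
    if not bids:
        return None
    # Sort by price; ties are broken by store name (an arbitrary assumption).
    ranked = sorted(bids, key=lambda b: (b.price, b.store))
    winner = ranked[0]
    paid = ranked[1].price if len(ranked) > 1 else winner.price
    return winner.store, paid


if __name__ == "__main__":
    offers = [Bid("store_a", 19.99), Bid("store_b", 17.50), Bid("store_c", 18.25)]
    print(run_reverse_vickrey(offers))
```

A Dutch auction would need a different loop (a price that moves step by step until some store accepts it), which is one reason the protocol type has to be announced when the auction opens.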



3.4 Implementation 

The EasiShop (client-side) system is installed on a standard PDA - the iPAQ 3870. 
Bluetooth is used to determine the user’s location. When the user is detected as being 
in the broadcast range of a particular EasiShop Hotspot (an area adjacent to a store 
within which agent communication may take place), the user’s location may be de- 
termined to within an accuracy of approximately twenty metres. From a software 
perspective, the system components are implemented in Java. Kaffe, a free JVM 
licensed under the terms of the GNU General Public License, is used as the runtime 
environment. The operating system is Linux. Bluetooth functionality is implemented 
via a runtime interface to the BlueZ [12] Bluetooth protocol stack. 
4 Gulliver’s Genie 

Gulliver’s Genie [13], [14] has investigated various issues relating to the practical 
realization of the mobile computing paradigm. More specifically, suitable architec- 
tures for realizing mobile computing applications, thereby facilitating the delivery of 
services including those with a substantial and dynamic multimedia component have 
been the subject of particular attention. 

From an application domain perspective, the Genie currently focuses on addressing 
the needs of roaming tourists, as this domain represents a microcosm of the broad 
issues facing both mobile application developers and end-users. In this the Genie is 
not a unique endeavor as a cursory examination of the literature will testify. One 
project similar in scope and objectives is CRUMPET [15], a result of the EU IST 
research programme. The indoor scenario, for example, museums, art galleries and 
exhibitions, has also been the subject of much research. Indeed, a useful overview of 
activity in this area may also be found in Bellotti et al. [16]. 



4.1 Services 

In theory, the Genie can provide any standard location-aware service. However, when 
the needs of tourists are considered, two services are essential: navigation support and 
the provision of cultural content. Given that tourists are almost inevitably exploring 
unknown territory, a navigation-support component is of practical importance. While 
roaming, tourists will encounter sites of cultural significance. Various aspects 
concerning such sites may be of interest to the tourist. In addition, there may be 
relationships between those sites and other attractions that the tourist has encountered during 
their travels. Such relationships may not be obvious without careful research on the 
visitor’s part. As the tourist’s spatial context and personal profile are known, there are 
significant opportunities for enhancing their experience through the proactive and 
selective delivery of appropriate content. The service is realized in the Genie in the 
form of rich multimedia presentations concerning the attractions encountered. An 
example of such a presentation may be seen in Fig. 3. 



4.2 Pertinent Characteristics 

Gulliver’s Genie differs in a number of ways from other efforts in this area. However, 
there are two strategies that it adopts that are of special interest. The first is that, 
similar to CRUMPET, it adopts an approach based on intelligent agents. More spe- 
cifically, however, it incorporates agents that conform to the strong notion of agency 
as articulated by Wooldridge & Jennings, and, in this case, such agents adopt a rea- 
soning strategy based on the Belief-Desire-Intention (BDI) paradigm. Traditionally, 
the computational cost of deploying such agents on lightweight devices was prohibi- 
tively expensive. Recent developments in hardware and software have rendered such 
concerns obsolete, except of course in the case of the most basic devices. 
Fig. 3. A Sample Presentation for the Church on the Campus 

A second feature of particular interest is the strategy the Genie adopts for dissemi- 
nating content. Users have high expectations and seek immediate access to desired 
content at any time, in any place and, increasingly, on any device. Meeting such 
expectations within the confines of limited devices and the limited bandwidth 
availability of wireless networks, particularly the cellular variety, is difficult. In an effort to 
address this, the Genie adopts a strategy that has been termed intelligent precaching 
[17]. In brief: a model of the tourist’s environment is maintained which contains, 
amongst other things, specific details of the various tourist attractions within it. If this 
model is considered in light of the tourist’s spatial context, that is, their position and 
orientation, as well as their personal interest profile, their likely future behavior can 
be estimated with a reasonably high degree of certainty. Therefore, the appropriate 
content can be downloaded to their device on a just-in-time basis. As this content is 
inherently adapted to the tourist’s context, as well as personalized to conform to their 
individual profiles, a satisfactory experience may be reasonably anticipated. 
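
The selection step behind intelligent precaching can be sketched as follows. This is only an illustration under stated assumptions: a flat local map in metres, a fixed search radius and field of view, and a linear distance discount. The names `Attraction` and `precache_candidates` and the scoring formula are hypothetical and do not describe the Genie's actual algorithm.

```python
import math
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Attraction:
    name: str
    x: float       # metres east of a local origin (assumed planar map)
    y: float       # metres north of a local origin
    category: str  # key into the tourist's interest profile


def bearing_to(px: float, py: float, a: Attraction) -> float:
    """Compass bearing in degrees (0 = north, clockwise) from (px, py) to a."""
    return math.degrees(math.atan2(a.x - px, a.y - py)) % 360.0


def precache_candidates(px: float, py: float, heading: float,
                        profile: Dict[str, float],
                        attractions: List[Attraction],
                        radius: float = 300.0,
                        field_of_view: float = 90.0) -> List[str]:
    """Rank nearby attractions that lie ahead of the tourist and match the profile."""
    scored = []
    for a in attractions:
        distance = math.hypot(a.x - px, a.y - py)
        if distance > radius:
            continue  # too far away to be worth downloading yet
        # Smallest angle between the tourist's heading and the attraction.
        offset = abs((bearing_to(px, py, a) - heading + 180.0) % 360.0 - 180.0)
        if offset > field_of_view / 2:
            continue  # behind or beside the tourist
        interest = profile.get(a.category, 0.0)
        if interest <= 0.0:
            continue  # the profile shows no interest in this category
        scored.append((interest * (1.0 - distance / radius), a.name))
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [name for _, name in scored]
```

Presentations for the top-ranked names would be requested first; anything that drops out of the list before the tourist reaches it can be discarded.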



4.3 Architecture 

In essence, Gulliver’s Genie comprises a suite of agents residing both on the client 
and on the server. All these agents collaborate to deliver the necessary services to the 
tourist (Fig. 4). Each agent is now briefly described: 

Spatial Agent: To determine a tourist’s spatial context, this agent autonomously 
monitors the GPS signal and interprets it accordingly. It periodically broadcasts this 
context to other interested agents. This agent is unique in that it harnesses its capacity 
to migrate. Though GPS is the de facto standard for position determination at present, 
systems using cellular network techniques are envisaged in the future. By using a 
mobile agent, the Genie can be deployed on those devices that utilize cellular 
techniques when they become available, as the appropriate agent encompassing the logic 
for handling cellular network positioning may be dispatched to the device. 




[Figure legend: PDA Agent, Mobile Agent, Server Agent] 



Fig. 4. Architecture of Gulliver's Genie 



Cache Agent: Intelligent precaching is one of the Genie’s defining characteristics 
and the Cache Agent is responsible for implementing this strategy on the client. An 
environmental model is provided by the GIS Agent on the server. By considering this 
model in light of the tourist’s movement, it identifies possible attractions that the 
tourist may visit. A multimedia presentation is requested from the Presentation Agent 
in anticipation that it will be downloaded by the time the tourist encounters the at- 
traction in question. Should this not occur, the presentation is simply discarded. 

GUI Agent: Controlling the interface on the tourist’s device is the main task of the 
GUI Agent. In normal navigation mode, an electronic map is displayed with the cur- 
rent position and orientation highlighted. This is, of course, continuously updated as 
the tourist moves, courtesy of updates from the Spatial Agent. Should the tourist 
encounter an attraction for which a presentation has been precached, the Cache Agent 
prompts the GUI Agent to display the presentation, monitor the tourist’s interaction 
and provide feedback to the Profile Agent. 

Registration Agent: Tourists must first register for Genie services. The Registration 
Agent takes care of this process as well as assigning Tourist Agents to individual 
tourists in response to requests for Genie services. 

Tourist Agent: All tourists registered for Genie services are assigned their own 
individual agent, termed a Tourist Agent, on commencing a session. In agent parlance, 
such agents are cloned from the tourist agent template. Essentially, this agent is the 
tourist’s interface to the services offered by the Genie. Acting on prompts from the 
Spatial Agent, it arranges the construction of environmental models in conjunction 
with the GIS Agent and prompts the Presentation Agent to maintain an updated list of 
presentations in anticipation of requests from the Cache Agent. Though there is some 
computational overhead in assigning an agent to each tourist, such an approach 
ensures the future scalability of the Genie as the number of concurrent tourists increases 
and a wider variety of services are offered. 

GIS Agent: Accurate environmental models are necessary for the successful opera- 
tion of the Genie. Such models are provided to the Cache Agent on the client as well 
as to the Presentation Agent for dynamic presentation pre-assembly on the server. 
Profile Agent: User profiles are essential to realizing adaptivity and personalization. 
The Profile Agent is responsible for maintaining user profiles and updating them in 
light of ongoing tourist interactions with the Genie. 

Presentation Agent: The provision of personalized multimedia presentations is a 
core tenet of the Genie’s raison d’être. Such presentations are assembled in light of 
the tourist’s profile and their current environmental model. This server-side presenta- 
tion repository is continuously updated in light of tourist movement and changes to 
their individual profiles. 



4.4 Implementation 

At present, the Genie runs on a standard PDA, namely an iPAQ. GPS, which gives a 
position reading to within 20 metres on average, is used for determining location. 
Orientation can also be derived from GPS, albeit in an approximate manner. For data 
communications, the standard 2.5G technology, the General Packet Radio Service 
(GPRS) is used. In each case, a corresponding PCMCIA card was procured. Both 
cards were then incorporated into the iPAQ via a dual-slot expansion sleeve. From a 
software perspective, the client components are implemented in Java and a commercial 
JVM, namely Jeode, is used as the runtime environment on the iPAQ. All 
communication with the server takes place over a standard HTTP connection. 
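
The remark that orientation can be derived from GPS can be made concrete with a small sketch. The function below computes the initial great-circle bearing between two consecutive position fixes using the standard forward-azimuth formula; the function name and the idea of using exactly two fixes are assumptions for illustration, not a description of how the Genie derives orientation.

```python
import math


def bearing_between(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Initial bearing in degrees (0 = north, clockwise) from fix 1 to fix 2.

    Inputs are latitude and longitude in decimal degrees.
    """
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    y = math.sin(dlon) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlon))
    return math.degrees(math.atan2(y, x)) % 360.0


if __name__ == "__main__":
    # Two hypothetical fixes taken a short time apart while walking north-east.
    print(round(bearing_between(53.3080, -6.2240, 53.3090, -6.2225), 1))
```

With a position error of around 20 metres, two fixes taken only a few metres apart yield an unreliable bearing, which is consistent with the orientation being described as approximate.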

On the server side, the Agent Factory runtime environment is deployed. This is 
augmented with a sophisticated database that supports multiple data types including 
geospatial data and multimedia-related data. A toolkit for populating this database 
forms an indispensable component. 



5 EasiShop Meets Gulliver’s Genie 

The development of these case studies has enabled the harvesting of many experi- 
ences that are pertinent to the design and delivery of ambient intelligence generally. 
We will briefly present these. 

Intelligence: Both systems are imbued with intelligence. The metaphor adopted in 
each case is the use of mobile intentional agents. Within both case studies this model 
was found appropriate and elegant, enabling the modular design of a highly complex 
system. Initial reservations concerning the amenability of a strong agent-based 
approach in terms of software footprint and computational overhead proved to be 
unfounded. The use of standard JVMs (Kaffe and Jeode) on differing operating systems 
(Linux and Windows) has demonstrated the applicability of this approach. Furthermore, 
the modular design has in both cases facilitated the organic growth of the architecture, 
with additional agents being introduced as the systems developed in maturity. 

Context-Awareness: User context is intrinsic to ambient intelligence. A variety of 
parameters typically contribute to a rich user context. Key amongst these is the ability 
to determine the user location. In EasiShop, Bluetooth is utilised to accomplish this 
while within Gulliver’s Genie, the combination of the pervasive GPS and, in time, 
Galileo is adopted together with an electronic compass for position determination. 
User location acts as a crucial filter for service admissibility. Both systems have dem- 
onstrated that relatively simple user profiling techniques can be perceived by users as 
being much deeper in terms of their power to personalise a service or interface. 

Wireless Networking: Wireless networks are core to ambient intelligence. The 
nature of the network may vary in terms of such key descriptors as availability, 
reliability and capacity, but invariably the same issue arises: how, within restricted 
bandwidth, to give the illusion of limitless capacity. Techniques like intelligent precaching 
as deployed by Gulliver’s Genie go some way towards sustaining this illusion. Other 
techniques that have been investigated include agent tuning and the use of MPEG-4 
content strata to render certain content layers prior to the arrival of more media-rich 
content. Gulliver’s Genie has been deployed using GSM, GPRS and High Speed 
Circuit Switched Data (HSCSD). EasiShop, on the other hand, uses the Bluetooth 
active EasiShop hotspot as the mechanism and time constraint within which the 
migration of the shopper agent to the marketplace is achieved. Thereafter, having 
participated in the auction process on the user’s behalf, it will return the results via GSM as an 
agent migration or a simple purchase update via the ubiquitous SMS. Opportunistic 
agent migration is central to EasiShop. 

Adaptivity & Personalisation: All users wish to perceive themselves as individuals 
with idiosyncratic needs. In reality they are not; however, pandering to this user 
aspiration has proven to be central to user adoption. System adaptivity has been 
incorporated within the ubiquitous shopping process through product recommendations 
which take due cognizance of the user’s style preferences, age, disposable income, 
location, gender and size, together with their disposition to brands and products 
based upon previous purchases. Gulliver’s Genie adapts its content based upon the 
monitoring of the follow-up links that the tourist requests in terms of additional 
information. The user profile is dynamically updated, ensuring that the content a teenage 
tourist receives at location X is different from that supplied to a senior citizen who 
adores antiquities at location Y. 



6 Conclusion 

This paper advocates the use of mobile intentional agents as a key enabler in the de- 
livery of ambient intelligence. Within this paper we consider two ambient intelligence 
case studies, namely those of EasiShop and Gulliver’s Genie. Both systems have 
adopted an agent-based design metaphor and specifically both have been realised 
through the Agent Factory development environment. 

We compare and contrast these case studies examining the key problems that 
manifest themselves in the roll-out of ambient intelligent systems. Experiences 
accrued from their development act as invaluable input to other developers of ambient 
systems. Both case studies stand as testimony to the efficacy of an agent-based ap- 
proach in the delivery of scalable ambient intelligence systems. Ongoing work in- 
cludes detailed user evaluations from which it is hoped improved design heuristics 
can be derived for the realisation of ambient intelligent applications and services. 
Abstract. We investigate the application of a logical model of agency, 
known as the KGP model, to develop agents for ambient intelligence 
applications. Using a concrete scenario, we illustrate how the logical formalism 
employed by a KGP agent allows a person to access the surrounding 
ambient through the agent in a transparent manner. We evaluate 
our claims by implementing the resulting interactions in PROSOCS, a 
prototype multi-agent systems platform that allows KGP agents to be 
deployed as components of ambient intelligence applications. 



1 Introduction 

The vision of Ambient Intelligence (AmI) is a society based on unobtrusive, 
often invisible interactions amongst people and computer-based services in a 
global computing environment [1]. Services in AmI will be ubiquitous in that 
there will be no specific bearer or provider but, instead, they will be associated 
with all kinds of objects and devices in the environment [7], which will not bear 
any resemblance to computers. People will interact with these services through 
intelligent and intuitive interfaces embedded in these objects and devices, which 
in turn will be sensitive to what people need. 

For a large class of the envisaged AmI applications, the added value of these 
new services is likely to be for ordinary people, and more specifically for what 
people are trying to do in ordinary social contexts. Such a requirement calls for 
technologies that are transparent. Transparency is here interpreted broadly as 
the ability of a person to understand, if necessary, the functional behaviour of 
an object or device in the environment. Put simply, transparency should bring 
AmI interactions closer to the way people are used to thinking rather than the way 
machines operate. AmI applications, as a result, will need the ability to process a 
form of logic with computational features in order to make interactions between 
people and objects transparent not only at the interface level. 

Another challenge posed by the AmI vision is that the electronic part of 
the ambient will often need to act intelligently on behalf of people. AmI cannot 
afford to stay within a direct manipulation interface where user requests result 
in monolithic system responses. Rather, the challenge here is that the ambient 
or, more specifically, conceptual components of it, will often need to be not 
simply spontaneous but also proactive, behaving as if they were agents that act 
on behalf of people. It would be more natural, in other words, to use the agent 
metaphor in order to understand components of an intelligent ambient. An agent 
in this context can be a software (or hardware) entity that can sense or affect 
the environment, has knowledge of the environment and its own goals, and can 
pro-actively plan to achieve its goals or those of its user(s), so that the combined 
interactions of the electronic and physical environment provide a useful outcome 
to one or more people. 

The contribution of this work is that it investigates the application of a logical 
model of agency, known as the KGP model, to build applications with the aim 
of providing a more transparent AmI. Agents in this context offer intelligent 
services to their users and manage spontaneous, ad-hoc interactions over distributed 
networks embedded in physical environments. The specific contribution of the 
work is that it shows how the KGP model of agency can be programmed in order 
to support the transparency and agency which might be required by AmI. The 
implementation of an application is discussed using PROSOCS, a multi-agent 
system platform that combines computational logic tools and Peer-to-Peer (P2P) 
Computing to support the envisaged Ami interactions. 

2 KGP Agents 

KGP (Knowledge, Goals and Plan) is a model of agency intended to represent 
the internal or mental state of an agent in a logical manner. The internal state 
of a KGP agent is a triple (KB, Goals, Plan), where: 

— KB describes what the agent knows of itself and the environment and 
consists of separate modules supporting the different reasoning capabilities. For 
example, $KB_{plan}$ supports planning, $KB_{GD}$ goal decision, etc. One part of 
the KB, called $KB_0$, reflects changes in the environment and is typically 
updated by the agent when it observes the environment through its sensing 
capability. We will assume that $KB_0$ is the only part of the knowledge of 
the agent that changes over time. 

— Goals is a set of properties that the agent has decided that it wants to 
achieve by a certain time possibly constrained via some temporal constraints. 
Goals are split into two types: mental goals that can be planned for by the 
agent and sensing goals that can not be planned for but only sensed to find 
out from the environment whether they hold or not. 

— Plan is a set of “concrete” actions, partially time-ordered, of the agent by 
means of which it plans (intends) to satisfy its goals, and that the agent can 
execute in the right circumstances. Each action in Plan relies upon precon- 
ditions for its successful execution. These preconditions might be checked 
before the actions are executed. 
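
As a rough illustration of the triple just described, the following Python sketch models a KGP-style state with an observation store, goals split into mental and sensing goals, and actions guarded by preconditions. It is a deliberately simplified data-structure sketch: the class and field names are assumptions, temporal constraints are reduced to a single deadline, and none of the logical reasoning of the KGP model is represented.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass(frozen=True)
class Goal:
    name: str
    fluent: str            # property the agent wants to hold
    deadline: int          # simplified temporal constraint: achieve before this time
    sensing: bool = False  # False: mental goal (plannable); True: sensing goal


@dataclass(frozen=True)
class Action:
    name: str
    operator: str
    goal: str                            # name of the goal the action serves
    preconditions: Tuple[str, ...] = ()  # fluents that must hold beforehand


@dataclass
class KGPState:
    kb0: List[Tuple[str, int]] = field(default_factory=list)  # (fact, time) pairs
    goals: List[Goal] = field(default_factory=list)
    plan: List[Action] = field(default_factory=list)

    def observe(self, fact: str, time: int) -> None:
        """Record an observation; KB_0 is the only part of KB that changes."""
        self.kb0.append((fact, time))

    def mental_goals(self) -> List[Goal]:
        return [g for g in self.goals if not g.sensing]

    def sensing_goals(self) -> List[Goal]:
        return [g for g in self.goals if g.sensing]

    def executable(self, action: Action, holding: Set[str]) -> bool:
        """An action may run only if all of its preconditions currently hold."""
        return all(p in holding for p in action.preconditions)
```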

A brief description of the transitions shown in Figure 1 is provided below: 

— Passive Observation Introduction (POI) changes $KB_0$ by introducing 
unsolicited information coming from the environment or communications received 
from other agents. Calls the Sensing capability. 
Fig. 1. In the KGP Model of Agency the knowledge in the agent’s “mind” is operated 
by a set of reasoning capabilities that allow the agent to perform planning, temporal 
reasoning, identification of preconditions of actions, reactivity and goal decision, to- 
gether with a sensing capability for the agent to perceive the environment in which it 
is situated. The capabilities are used by a set of transition rules describing how the 
mental state changes as a result of the agent being situated in an environment. Agent 
behaviour is obtained by combining sequences of transitions determined by reasoning 
with a cycle theory that controls this behaviour in a flexible manner. 



— Active Observation Introduction (AOI) changes $KB_0$ by introducing the 
outcome of sensing actions for properties of interest to the agent; these 
properties are actively sought. Calls the Sensing capability. 

— Sensing Introduction (SI) transition adds to the current Plan new sensing 
actions for sensing the preconditions of actions already in Plan, and uses 
the Sensing capability. 

— Plan Introduction (PI) changes part of the Goals and Plan of a state, ac- 
cording to the output of the Planning capability. This transition uses also 
the Identification of Preconditions capability, in order to equip each action 
A in the set As computed by Planning, with the set of preconditions for the 
successful execution of A. 

— Goal Introduction (GI) changes the Goals of a state by replacing them with 
goals that the Goal Decision capability decides to have highest priority. 

— Reactivity (RE) is responsible for updating the current state of the agent by 
adding the goals and actions returned by the Reactivity capability. As with 
PI, this transition too uses the Identification of Preconditions capability, in 
order to equip each action A in the set As computed by Reactivity, with the 
set of preconditions for the successful execution of A. 
— Goal Revision (GR) revises Goals, e.g. by dropping goals that have already 
been achieved or that have run out of time, by using the Temporal Reasoning 
capability and by checking the temporal constraints of the goals. 

— Plan Revision (PR) revises Plan, e.g. by dropping actions that have already 
been executed successfully or that have run out of time, by checking the 
temporal constraints of the actions. 

— Action Execution (AE) is responsible for executing all types of actions, thus 
changing the $KB_0$ part of KB by adding evidence that actions have been 
executed. Calls the Sensing capability for the execution of sensing actions. 

The model allows agents to deal with partial, dynamically evolving information, which 
is reasoned upon via the Planning, Reactivity and Temporal Reasoning 
capabilities. For more details on the KGP model the reader is referred to [4]. 

3 An AmI Scenario Based on KGP Agents 

Francisco Martinez travels from Spain to Italy. He makes his trip easier 
by carrying with him a personal communicator, a device that is a hybrid 
between a mobile phone and a PDA. The application running on this 
personal communicator provides the environment for an agent, a piece 
of software that augments the direct manipulation interface of the device 
with proactive information management within the device and flexible 
connectivity to smart services, assumed to be available in objects avail- 
able in the global environment Francisco travels within. When Francisco 
arrives in Rome, the agent avoids a situation where Francisco has to 
unnecessarily buy a museum ticket, helps Francisco to find other people 
to share a taxi going from a hotel to a train station, and orders the taxi 
by holding a reverse auction to obtain the minimum price possible. 



3.1 Reasoning Features of KGP Agents 

To build the mental state of a KGP agent for a scenario like the one presented 
above we use well-understood reasoning techniques. We rely upon Abductive 
Logic Programming (and in particular the work described in [3]) to support the 
reasoning required for the capabilities of planning, reactivity, and identification 
of preconditions of actions. We also use Logic Programming with Priorities to 
support the generic functionality required for the goal decision capability and 
the development of cycle theories that control the agent’s behaviour. Here, we 
focus on the specification of a KGP agent that reacts and plans in the environ- 
ment in which it is situated. In Figure 2 we present a specialised version of the 
event calculus [5,8] that a KGP agent uses, appropriate to fit the KGP mode of 
operation (see [6] for more details). 

In the example programs that follow we will give only the domain-dependent 
components of the knowledge base, namely (and typically) the definitions of the 
event calculus predicates initially , initiates, terminates and precondition, as 
$holds\_at(F, T_2) \leftarrow happens(O, T_1),\ initiates(O, T_1, F),\ T_1 < T_2,\ \neg clipped(T_1, F, T_2).$
$holds\_at(\neg F, T_2) \leftarrow happens(O, T_1),\ terminates(O, T_1, F),\ T_1 < T_2,\ \neg declipped(T_1, F, T_2).$
$holds\_at(F, T) \leftarrow initially(F),\ 0 < T,\ \neg clipped(0, F, T).$
$holds\_at(\neg F, T) \leftarrow initially(\neg F),\ 0 < T,\ \neg declipped(0, F, T).$

$clipped(T_1, F, T_2) \leftarrow happens(O, T),\ terminates(O, T, F),\ T_1 < T < T_2.$
$declipped(T_1, F, T_2) \leftarrow happens(O, T),\ initiates(O, T, F),\ T_1 < T < T_2.$
$clipped(T_1, F, T_2) \leftarrow observed(\neg F, T),\ T_1 < T < T_2.$
$declipped(T_1, F, T_2) \leftarrow observed(F, T),\ T_1 < T < T_2.$
$holds\_at(F, T_2) \leftarrow observed(F, T_1),\ T_1 < T_2,\ \neg clipped(T_1, F, T_2).$
$holds\_at(\neg F, T_2) \leftarrow observed(\neg F, T_1),\ T_1 < T_2,\ \neg declipped(T_1, F, T_2).$

$happens(O, T) \leftarrow executed(O, T).$
$happens(O, T) \leftarrow observed(O, T', T).$
$happens(O, T) \leftarrow assume\_happens(O, T).$
$holds\_at(F, T) \leftarrow assume\_holds\_at(F, T).$

Integrity Constraints:
$holds\_at(F, T),\ holds\_at(\neg F, T) \Rightarrow false.$
$assume\_happens(A, T),\ precondition(A, P) \Rightarrow holds\_at(P, T).$

Fig. 2. Extract of how a KGP agent situated in an environment concludes that a fluent 
F holds at a time T. Fluents holding at the initial state are represented by what holds 
initially. Fluents are clipped and declipped by events happening at different times. 
Events that happen will initiate and terminate properties holding at or between times. 
Fluents will also hold if they have been observed or result from actions that have been 
executed by the agent. An operation can be made to happen and a fluent can be made 
to hold simply by assuming them. In KGP we also use domain-independent integrity- 
constraints, to state that a fluent and its negation cannot hold at the same time and, 
when assuming (planning) that some action will happen, we need to enforce that each 
of its preconditions hold. 



well as any other definitions of interest. In this context we will assume that 
(timed) action operators are represented within the (Plan component of the) 
state of an agent in the form: 

$op(arg_1, arg_2, \ldots, arg_n, T)$

where op is the operator of the action, T the time argument of the action, and 
$arg_1, \ldots, arg_n$ any other (ground) arguments for the action. Also, we will assume 
that actions are represented within $KB_{plan}$, $KB_{react}$, $KB_{TR}$ as: 

$assume\_happens(op(arg_1, arg_2, \ldots, arg_n), T).$

Similarly for timed fluent literals. Furthermore, we will give $KB_0$ as a set of 
facts of the form: 

$executed(op(arg_1, arg_2, \ldots, arg_n, T), \tau).$
$observed(lit(arg_1, arg_2, \ldots, arg_n, T), \tau).$
$observed(op(arg_1, arg_2, \ldots, arg_n, \tau'), \tau).$

where op is an action operator, lit is a fluent literal, T is an existentially quantified 
variable, and $\tau$, $\tau'$ are ground constants (the concrete time of execution or 
observation). However, we will assume that the capabilities of planning, reactivity 
and temporal reasoning (and thus goal decision that uses temporal reasoning), 
use the ground variant of $KB_0$, namely the set of facts of the form 

$executed(op(arg_1, arg_2, \ldots, arg_n, \tau)).$
$observed(lit(arg_1, arg_2, \ldots, arg_n, \tau)).$
$observed(op(arg_1, arg_2, \ldots, arg_n, \tau'), \tau).$

Finally, we will assume that the behaviour of agents is given via operational 
traces, where passive observations from the environment (via POI) is always 
treated as some sort of interrupt. 

3.2 Not Buying a Museum Ticket 

Francisco has asked his agent to buy a ticket for a (state-run) museum that
Francisco wants to visit. The agent has developed a plan to buy it. However,
before executing the plan, the agent observes, from a “message” sent by another
agent ma (the museum agent), that it is European Heritage Day (ehd) and that, as a
result, all state-run museums in Europe are free that day. Based on this
observation its goal is achieved, the buying action is not executed, and both
goal and plan are deleted.

State. The agent has a goal g1 to buy a ticket within a specific time-limit, as
well as a plan to buy it by executing an action a1:

Goals = {g1 = (have(ticket, T), ⊥, {T < 10})},

Plan = {a1 = (buy(ticket, T'), g1, have_money(T'), {T' < T})}.

KB_TR contains:

initiates(ehd, T, have(ticket)).
initiates(buy(O), T, have(O)).
precondition(buy(O), have_money).

The remaining components of KB do not play any role in this example and can 
be considered to be empty. 

Trace. Francisco can eventually trace the behaviour of the agent (the short 
names for transitions POI, GR and PR below have already been described in 
section 2): 

POI: observed(ma, ehd(Time), 6).

GR: at time 8, g1 is eliminated from Goals as it is achieved.

PR: at time 10, a1 is eliminated from Plan, as g1 is not in Goals anymore.
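
The effect of this trace can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: time constraints and preconditions are omitted, and the names INITIATES and run_trace are hypothetical rather than part of the KGP model.

```python
# Hypothetical encoding of the museum example; names mirror the text loosely.
INITIATES = {"ehd": "have(ticket)", "buy(ticket)": "have(ticket)"}

def run_trace(observations):
    """Apply POI, GR and PR once and return the resulting (Goals, Plan)."""
    goals = {"g1": "have(ticket)"}                        # goal name -> fluent
    plan = {"a1": {"action": "buy(ticket)", "goal": "g1"}}
    # POI: record which fluents the observed events initiate.
    achieved = {INITIATES[e] for e in observations if e in INITIATES}
    # GR: drop goals whose fluent already holds.
    goals = {g: f for g, f in goals.items() if f not in achieved}
    # PR: drop actions whose parent goal has been removed.
    plan = {a: d for a, d in plan.items() if d["goal"] in goals}
    return goals, plan
```

Calling run_trace(["ehd"]) leaves both Goals and Plan empty, whereas run_trace([]) keeps g1 and a1, so the ticket would still have to be bought.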

3.3 Sharing a Taxi 

Francisco’s agent tries to find a person who is likely to agree to share a taxi
with Francisco from the hotel to the train station. Francisco’s agent has asked all
other available agents in the hotel society for the resource, but did not manage
to get it, so it suspends its goal. When a new agent becomes available in the
hotel, however, Francisco’s agent can go back and ask the new agent.

State. The agent has a goal to reach the station and no plan for it:
Goals = {g1 = (reach_station(T), ⊥, {T < 20})}, Plan = {}.

KB_plan consists of the following domain-dependent knowledge:

holds_at(reach_station, T) ←
    holds_at(accepted_share, T'), T' < T.
holds_at(accepted_share, T) ←
    holds_at(customer(O), T'),
    happens(tell(f, O, request(share_taxi), d(ex7, T')), T'),
    holds_at(accepted_request(O, share_taxi), T''),
    T' < T'' < T.

precondition(tell(f, O, request(share_taxi), D), ¬already_rejected(O)).
initiates(tell(O, f, reject(request(share_taxi)), D), T,
    already_rejected(O)).
initiates(tell(O, f, accept(request(share_taxi)), D), T,
    accepted_request(O, share_taxi)).

These definitions essentially require the agent to ask the agents of the hotel’s
customers one after the other, in sequence, without asking the same agent twice,
until one accepts the offer. Knowledge about customers in the hotel can be drawn
from the following domain-dependent definitions, which are a simplification of the
ones found in [10], where, to access the electronic environment of a hotel, the
agent joins an artificial agent society residing in the hotel’s AmI infrastructure:

initially(customer(o1)).
initially(customer(o2)).
initially(¬already_rejected(o1)).
initially(¬already_rejected(o2)).
initiates(join(C), T, customer(C)).

To join an artificial society the agent executes the action of joining itself: 

executable(join(f, T)).

executable(tell(f, C, S, D)) ← C ≠ f.

assume_happens(A, T), not executable(A) ⇒ false.

As before, we assume that KB_0 is initially empty and that the other components
of KB do not play any role (they can be considered empty too).

Trace. The internal transitions of the agent can be summarised as follows:

PI: action a_o1 and goal g_o1 are added to Plan and Goals, respectively:

a_o1 = (tell(f, o1, request(share_taxi), d(ex7, T'), T'), g1, ¬already_rejected(o1), {T' < T''}).

g_o1 = (accepted_request(o1, share_taxi, T''), g1, {T'' < T}).

AE: a_o1 is executed: executed(tell(f, o1, request(share_taxi), d(ex7, T'), T'), 1).

POI: observed(o1, tell(o1, f, reject(request(share_taxi)), d(ex7, T'), 3), 3).

GR & PR: g_o1 and a_o1 are deleted from Goals and Plan, respectively.

PI: action a_o2 and goal g_o2 are added to Plan and Goals, respectively:

a_o2 = (tell(f, o2, request(share_taxi), d(ex7, T*), T*), g1, ¬already_rejected(o2), {T* < T**}).

g_o2 = (accepted_request(o2, share_taxi, T**), g1, {T** < T}).

AE: a_o2 is executed: executed(tell(f, o2, request(share_taxi), d(ex7, T*), T*), 5).

POI: observed(o2, tell(o2, f, reject(request(share_taxi)), d(ex7, T*), 6), 6).

GR & PR: g_o2 and a_o2 are deleted from Goals and Plan, respectively.

POI: a new agent joins: observed(o3, join(o3, 3), 10).

PI: action a_o3 and goal g_o3 are added to Plan and Goals, respectively:

a_o3 = (tell(f, o3, request(share_taxi), d(ex7, T!), T!), g1, ¬already_rejected(o3), {T! < T!!}).

g_o3 = (accepted_request(o3, share_taxi, T!!), g1, {T!! < T}).

AE: a_o3 is executed: executed(tell(f, o3, request(share_taxi), d(ex7, T!), T!), 12).

POI: observed(o3, tell(o3, f, accept(request(share_taxi)), d(ex7, T!), 15), 15).

GR & PR: g_o3, g1 and a_o3 are deleted from Goals and Plan, respectively.
The presence of the new agent has allowed f to achieve its goal successfully.
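
The behaviour in this trace, asking each customer once, suspending the goal and resuming when a new agent joins, can be approximated by the following Python sketch. It assumes a synchronous respond function standing in for the tell/observe exchange; the TaxiSharer class and its method names are hypothetical and do not come from the KGP model or PROSOCS.

```python
class TaxiSharer:
    """Asks hotel customers in sequence, never asking the same one twice."""

    def __init__(self, customers):
        self.customers = list(customers)
        self.already_rejected = set()

    def join(self, customer):
        # A newly joined agent becomes a customer that may be asked.
        self.customers.append(customer)

    def find_partner(self, respond):
        """Return (partner or None, list of customers asked in this round)."""
        asked = []
        for other in self.customers:
            if other in self.already_rejected:
                continue  # precondition: not already_rejected(other)
            asked.append(other)
            if respond(other) == "accept":
                return other, asked
            self.already_rejected.add(other)
        return None, asked  # goal suspended until somebody new joins

# Replies assumed for illustration, matching the trace above.
replies = {"o1": "reject", "o2": "reject", "o3": "accept"}
```

With these replies, a first call to find_partner on a TaxiSharer for o1 and o2 asks both and finds nobody. After join("o3"), a second call asks only o3, which accepts.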



3.4 Ordering a Taxi 

In order to get a taxi with the lowest fare, Francisco’s agent f decides to hold a
reverse auction, where the agent sets the maximum price Francisco is willing to
pay, starts the auction by broadcasting to all available taxis in the ambient and,
after collecting all bids from the different taxis, notifies the winner (namely
the agent representing the taxi offering the lowest fare) as well as all the other
bidders (namely the agents representing the rest of the taxis, which offered
higher fares than the one accepted).



Bidder agents. The way taxi agents that act as bidders react is described in
KB_react^t1, KB_react^t2 and KB_react^t3, all containing the rule:

observed(tell(A, ti, openauction(taxi, TEnd, TDeadline), Id, TOpen)),
holds_at(cost_taxi(C), TOpen)
⇒
assume_happens(tell(ti, A, bid(I, C), Id), T),
TOpen < T < TEnd.

In other words, if somebody opens an auction asking for a taxi, then reply with
a bid. We assume that the destination of the journey is not an issue in this
example, and that each of the taxi agents will have a concrete cost in its KB,
e.g. KB_react^t1, KB_react^t2 and KB_react^t3 would have, respectively:

initially(cost_taxi(3)).
initially(cost_taxi(5)).
initially(cost_taxi(7)).

Auctioneer agent. In formulating the knowledge base of f, we will assume the
existence of an external object that we will refer to as ac (standing for auction
calculator), which f will interact with in order to get a solution to the (simple)
optimisation problem it has to solve in order to decide the winner of the auction
that f will open. We will not make any assumptions here on the internal
structure of ac, but we will assume that it is “wrapped” in such a way that f can
communicate with it. Moreover, we will assume that f and ac are synchronized,
in the sense of sharing the same clock, and that ac is committed to return a
solution of the optimisation problem by the auction deadline that f passes to it.
The following is the set of reactive rules in KB_react^f:

holds_at(want_taxi, T),
holds_at(availableTaxi(X), T)
⇒
assume_happens(tell(f, X, openauction(taxi, T + 5, T + 10), d(fpsb, T)), T).

In other words, whenever in need of a taxi, f opens an auction for a taxi,
communicating to all taxi drivers it believes to be available the end time of the
auction (T + 5) and the deadline for deciding the winner of the auction (T + 10).

observed(C, tell(C, f, bid(I, C), Id, T0), T),
executed(f, tell(f, C, openauction(I, TEnd, TDeadline), Id, T), T1),
T1 < T0
⇒
assume_holds(tell(f, ac, calculate(Id, I, TEnd, TDeadline), d(ac, T')), T'),
T < T' < TDeadline − 2.

Based on this rule, agent f sends every bid it receives to the auction calculator
ac, as soon as it receives it.

observed(ac, tell(ac, f, results(Id, I, TEnd, TDeadline, (C, Q), R), T0), T),
executed(f, tell(f, X, openauction(I, TEnd, TDeadline), Id, T), T1),
T1 < T0
⇒
assume_holds(tell(f, C, answer(win, I, Q), Id), T'), T' < TDeadline.

With the rule above, agent f is in a position to notify the winner of the auction
as soon as it receives the decision from ac (we assume here that ac sends the
notification by the deadline of the auction and thus f is able to meet this deadline
in turn).

observed(ac, tell(ac, f, results(Id, I, TEnd, TDeadline, (C, Q), R), T0), T),
executed(f, tell(f, X, openauction(I, TEnd, TDeadline), Id), T),
T1 < T0
⇒
assume_happens(tell(f, C, answer(lose, I, Q), Id), T), TEnd < T < TDeadline.

Similarly, agent f is now in a position to notify the losers of the auction as soon
as it receives the decision from ac, using the following facts in its KB_react^f:

initially(available(t1)).
initially(available(t2)).
initially(available(t3)).
We expect that the auction calculator will use the steps:

step 1: find_all_bids(Id, I, TEnd, ValidBids),
step 2: find_minimum(ValidBids, (Winner, Bid)),
step 3: Lost = minus(ValidBids, (Winner, Bid)).

so that all the bids are computed (ValidBids), then the winning bid is
determined (the tuple (Winner, Bid)), and then the list of bids that remained
(Lost) is evaluated.
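
The three steps can be sketched in Python as follows. Since we make no assumptions on the internal structure of ac, this is only one possible reading: the function name decide_auction is hypothetical, and treating a bid as valid when it arrives strictly before TEnd is an assumption taken from the bidders' rule.

```python
def decide_auction(bids, t_end):
    """bids is a list of (bidder, fare, time) triples.

    Returns ((winner, fare), lost), or (None, []) when no bid is valid.
    """
    # Step 1: keep only the bids submitted before the end of the auction.
    valid = [(bidder, fare) for bidder, fare, t in bids if t < t_end]
    if not valid:
        return None, []
    # Step 2: in a reverse auction the lowest fare wins (first one on a tie).
    winner = min(valid, key=lambda bid: bid[1])
    # Step 3: every other valid bid has lost.
    lost = [bid for bid in valid if bid != winner]
    return winner, lost
```

For the three taxis of the example, bidding 3, 5 and 7 before the end of the auction, the sketch returns t1 with fare 3 as the winner and the bids of t2 and t3 as the losing ones.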



Auction Interactions. To save space, instead of giving the detailed traces of
each agent participating in the auction, we present here a summary of the
interaction between the auctioneer and the taxi agents. The auctioneer (f) generates
(via RE) and executes (via AE) an action to open the auction. Then the bidders
(taxis) record the communicative act via POI and put forward their bids (offering
fares of 3, 5 and 7, respectively), again by RE and AE. Once all taxi agents have
sent their bids, f records the bids via POI and, via RE, sends them to ac. After
having received the final calculations from ac, recorded via POI, f generates, via
RE, the communicative actions to make t1 the winner and the other two losers,
and, via AE, it executes the actions.

3.5 Implementation Issues 

We have implemented the interactions that we have discussed in the previous
sections using PROSOCS [9], an experimental platform that allows the deployment
of KGP agents in open computing environments such as those characterising
AmI. We have been using the platform’s basic interface, presented in Fig. 3, to
check whether a KGP agent produces the expected behaviour. Using this basic
interface we can inspect the various parts of the behaviour of the agent and
examine its computational trace. An additional advantage of the PROSOCS
platform is that it allows newly deployed agents in a network to be discovered
automatically, using the P2P protocols available in the JXTA project [11]. JXTA
makes PROSOCS a suitable platform for AmI applications, especially those that
require ad-hoc networks for their deployment. PROSOCS also provides a society
infrastructure [2] that allows interactions between agents to be checked for
conformance to specific social rules.

Fig. 3. A screen-shot of the agent interface supported in PROSOCS. A KGP agent in
PROSOCS is called a computee, to underline the fact that the mind of the agent
is developed in computational logic. On the top-left part of the figure the components
of the mental state of the agent are shown, e.g. KB_0, the goals and the plan.
Depending on which one of these components is selected, the content of the component
is shown in the top-right part of the figure. The computational trace is shown in
the bottom-left part, while any observed events or executed actions are shown in the
middle- and bottom-left part of the screen.

4 Concluding Remarks

We have illustrated how to apply the KGP model of agency to develop ambient
intelligence applications. We have shown how the logical formalism employed by
a KGP agent allows a person to access the surrounding ambient in a transparent
manner. Transparency in KGP is the agent’s ability to provide a narrative
of its mental state, in terms of transitions, that allows a person to understand
the agent’s behaviour in achieving a goal. We have shown how three concrete
examples can be formulated in the model and we have briefly discussed their
implementation in the prototype platform PROSOCS.

Some aspects of the KGP model, both generally and for Ambient Intelligence
applications, will require further investigation. For example, we have not
addressed to date the issue of endowing KGP agents with learning capabilities,
which would allow them to revise and augment their knowledge bases suitably. We
envisage that an additional Knowledge Revision transition could be smoothly
incorporated into the model, but its definition will require further thought. Also,
we have not studied the problem of distinguishing between relevant and
irrelevant knowledge when reasoning in order to perform a specific user task. We
envisage that this issue will be of great practical importance for the actual
use of KGP agents in applications.

We are currently developing more detailed experiments with the aim of
extending the platform to allow KGP agents to interact not only with agents but also
with objects in the environment in a generic way. Additional experimentation,
especially in the context of the reverse auction example, is under way. Also,
the presentation of agent traces using a more “natural” interface is the subject
of future work. To date, the experimentation has been based on PCs in a lab.
Further experimentation in less controlled settings, e.g. wireless networks using
PDAs, will be necessary to fully demonstrate the suitability of the model for
Ambient Intelligence.

Abstract. Context-aware services platforms aim at supporting the handling of
contextual information in order to provide better user-tailored services. This 
paper addresses our current efforts towards a configurable and extensible 
services platform for context-aware applications. It discusses the use of a 
language and ontologies to cope with configurability and extensibility aspects. 



1 Introduction 

Context-awareness has emerged as an important and desirable feature in distributed 
mobile systems. This feature deals with the ability of applications to utilize 
information about the user’s environment (context) in order to tailor services to the 
user’s current situation and needs [1].

Building context-aware systems involves the consideration of several new 
challenges mainly related to the gathering/sensing, modeling, storing, distributing and 
monitoring of contextual information. These challenges justify the need for proper 
architectural support. In this work, we are particularly interested in services platform 
architectures to support context-aware applications. 

Ideally, a platform for context-aware applications should facilitate the creation and 
the dynamic deployment of a large range of applications, including those that are 
unanticipated at platform design-time. In this short paper, we briefly describe our 
current efforts towards a configurable and extensible services platform to support 
context-aware applications, which include the definition of a language to allow 
dynamic configuration of the platform, and the use of ontology support. 



2 The Services Platform 

The services platform forms the system environment for context-aware mobile 
applications. It supports the scenario in which context information is gathered from 
Context Providers (sensors or third-party information providers) and services are 
implemented by third-party service providers. The services platform aims at 
delivering the most adequate services based on both application requirements and 
contextual facts (see Fig. 1). Applications describe their requirements by defining the
desired services and the contextual conditions in which the services should be
provided. The platform should autonomously act according to reaction rules, which
are defined in terms of conditions to be checked against contextual facts.




Fig. 1. Overview of the services platform

We have identified the essential requirements to be satisfied by a services platform for 
context-aware applications [2], which include: (i) Reactivity to stimuli from the 
environment: the platform should allow the specification of events, which reflect 
particular changes in the users’ environment (contextual changes). In addition, the 
platform should allow the specification of actions that are the response to 
(combination of) these events; and (ii) Support for context handling: the platform 
should provide efficient mechanisms to gather, store, distribute and monitor 
contextual information. 

We have addressed some of these requirements when designing the platform 
architecture. Most of our efforts have been spent on developing an architecture with a 
high level of configurability. The proposed solution includes the definition of a 
subscription language, which allows applications to dynamically expose their needs to 
the platform. Fig. 2 depicts the proposed services platform architecture, which
contains three main components: Monitor, Registry and Context Interpreter.




The Context Interpreter gathers contextual information from Context Providers 
(sensors or third-party providers), manipulates contextual information and makes it 
uniformly available to the rest of the platform. The Registry maintains information 
necessary to support the interpretation of application requirements and the execution 
of services. The Monitor is the core of the platform, since it is responsible for 
receiving and interpreting application requests and making them active within the 
platform. The details of the services platform architecture can be found in [1].

In our approach, application-platform interactions are dynamically configured
through the definition of application subscriptions. In a subscription, an application
dynamically exposes its requirements to the platform, which composes
new tailored services from the set of available services at runtime.

2.1 The WASP Subscription Language (WSL)

WSL is a descriptive language that we have developed in order to specify
application subscriptions. The main clause of this language is the ACTION-GUARD
clause, with which it is possible to specify conditions and response actions.

The ACTION-GUARD clause defines that one or more actions should be triggered
as a consequence of a correlation of events. This clause allows one to represent
actions that are performed by the platform as a reaction to stimuli defined in a logical
expression. To illustrate the usage of this clause, consider a simple health scenario in
which a patient is being monitored for possible medical emergencies. In case an
emergency occurs, help (an ambulance or a doctor) should be sent to the patient’s
location, his close relatives need to be contacted, and the hospital needs to be
informed of his arrival. An application can express this scenario by submitting the
following application subscription to the support platform:

ACTION sendHelp; callRelatives; informHospital

GUARD john.condition == emergency

This is just an abstract view of the ACTION-GUARD clause semantics: the condition is
placed after the GUARD keyword, and the actions are listed after the ACTION
keyword. More information on the language can be found in [1].
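
To make the semantics of the clause more concrete, the following Python sketch evaluates a guard against contextual facts and, if it holds, triggers the listed actions in order. It is not the WSL implementation: the Subscription class, the react function, the dictionary-based context and the stub services are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class Subscription:
    guard: Callable[[dict], bool]   # condition over contextual facts
    actions: List[str]              # names of the services to trigger

def react(sub: Subscription, context: dict,
          services: Dict[str, Callable[[dict], str]]) -> List[str]:
    """Trigger all actions of `sub`, in order, if its guard holds."""
    if not sub.guard(context):
        return []
    return [services[name](context) for name in sub.actions]

# ACTION sendHelp; callRelatives; informHospital
# GUARD john.condition == emergency
emergency_sub = Subscription(
    guard=lambda ctx: ctx["john"]["condition"] == "emergency",
    actions=["sendHelp", "callRelatives", "informHospital"],
)

# Stub services that only report that they ran (hypothetical).
services = {name: (lambda ctx, n=name: n + " done")
            for name in emergency_sub.actions}
```

Calling react(emergency_sub, {"john": {"condition": "emergency"}}, services) runs the three stub services, while any other condition yields an empty list. A real platform would evaluate the guard whenever the relevant contextual facts change rather than on demand.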



2.2 Ontology Support 

The unambiguous comprehension of the meaning of contextual information among
system parts is essential for the correct operation of context-aware systems. The
current platform support for knowledge representation is restricted to UML Class
diagrams [1]. This approach may lack the expressiveness and rigor required to model
context-aware systems.

With ontologies, we can formalize the properties and structure of contextual
information to guarantee common semantic understanding among system parts. In
addition, ontologies offer inference and reasoning mechanisms that are necessary to
derive complex contextual information and to reason about the context, respectively [3].







Fig. 3. Services platform using ontology support 



We are currently evaluating the applicability of ontologies to our services platform. 
Fig. 3 presents an extension of the services platform introducing the use of ontologies.
In this figure, the Context Providers provide the context information by means of
messages defined in terms of a set of explicitly referenced ontologies, allowing the
platform to understand these messages by using the semantics represented in the
referenced ontologies. The context-aware applications subscribe to services provided
by the platform by means of context specifications in the application subscription. We
would like to support ontologies not only to formalize context but also to exploit them
for simple service descriptions and more complex service composition.

The expected benefits of using ontologies are (i) Flexibility: knowledge is defined
in terms of an ontology instead of being “hardcoded” within the platform; (ii) More
intelligent behavior: knowledge can be derived from the factual knowledge explicitly
represented in the ontologies; (iii) Semantic interoperability: the semantics of the
(possibly several) languages used by the platform’s external parties can be defined in
terms of a set of interrelated ontologies; and (iv) Expressiveness and consistency
checking: context information is represented using a formal representation language,
which makes it possible to check the consistency of the models automatically.



3 Final Remarks 

We have briefly presented in this paper our efforts towards a configurable and
extensible services platform for context-aware applications. Our approach supports
the configuration of application-platform interactions at runtime. In order to allow
dynamic configuration of interactions we have introduced WSL, which is a
descriptive language developed especially for this purpose.

We have seen that WSL presents limitations, especially with respect to the
definition of more complex event interrelations. The platform allows the definition of
single independent events and their correlation using logical operations.
Currently these events are tied to one specific application subscription. We would
like to be able to define events that depend on other events (event composition) and
also to specify events that can be reused by other application subscriptions. In our
current efforts we consider applying rule-based languages (e.g. Jess [4]) to support
more complex event interrelationships.

Furthermore, we argue that UML Class diagrams may lack the expressiveness and
rigor required to model context-aware systems. We have briefly discussed the benefits
of using ontologies within the services platform.

Abstract. In this paper, we review different context classification systems that
have been used to define elements of context. Although existing classification 
systems cover various types of context, in the development of context aware 
applications, only a few types of context have been used. In this work, we aim 
to build a context classification model based on Activity Theory that provides a 
basis both for dialogue amongst context awareness researchers and for the im- 
plementation of a context awareness architecture. 



1 Introduction 

In an ambient computing environment, users are able to carry out their everyday
activities and at the same time access information or use computing services across
multiple places and times. As a result, the user’s attention may be divided between
several simultaneous activities. Moreover, ambient devices are becoming smaller, to
the point of disappearing, which raises usability issues. Researchers have attempted
to improve user interaction through the notion of context awareness by exploiting
information relating to users, devices and environments.

Researchers in the context awareness field produce different definitions and classifi- 
cation systems for context, covering various elements of context. For the most part, 
however, context aware applications have utilized only isolated subsets of their con- 
text, such as a location or a device’s state. A truly context aware system needs to take 
account of the wide range of interrelated types of context and the relationships 
amongst them. As a precursor to implementing such systems, we need an approach to 
modelling context that takes account of this complexity. 

This paper starts by providing a review of some context classification systems and
examples of projects. Secondly, some problems with previous context classification
systems are discussed. Activity Theory is then introduced as a potentially valuable
approach to developing a comprehensive context classification, with an example
scenario used to demonstrate how different types of context and their relationships may
be identified. Finally, we discuss the potential of applying Activity Theory in
developing a comprehensive context classification.

2 Related Work

Researchers have tried to develop better understandings of context by producing 
context definitions and classification systems. Table 1 shows that different research- 
ers have presented context classification systems containing different elements. The 
columns in Table 1 are derived from elements that researchers have tried to classify 
as part of context and the rows show different classification systems. 



Table 1. Past context classification systems

Columns, left to right: Location | Conditions | Infrastructure (Computing Environment) | Information on User | Social | User Activity | Time | Device Characteristics

Each row lists its cells from left to right; "-" marks an empty cell, and a named entry may span several adjacent columns.

[Benerecetti et al. '01]: Physical Environment | - | Cultural Context | - | - | -
[Schmidt et al. '99]: Physical Environment | Human Factor | X | -
[Lieberman and Selker '00]: User Environment | Physical Environment | X | User Environment | - | - | X | -
[Hull et al. '97]: - | Physical Environment | - | X | - | - | - | X
[Chalmers and Sloman '99]: X | - | X | - | X | X | - | X
[Lucas '01]: Physical Environment | Information context | X
[Schilit et al. '94]: Physical Environment | X | User environment | - | - | -
[Abowd and Dey '99]: X | - | Identity | X | X | Identity
[Chen & Kotz '00]: Active/Passive




In the first row of Table 1, Benerecetti, Bouquet and Bonifacio [2] have classified
context into physical context and cultural context. Physical context is a set of features
of the environment, while cultural context includes user information, the social
environment and beliefs. Schmidt et al. [14] have extended the classification into three
dimensions: physical environment, human factors and time. The human factors cover
the same features as cultural context. They added time because time allows the
context model to hold the history of context, which influences the modelling of the
user’s past, current and future actions.

Lieberman and Selker [12] have ignored time and classified context to include the 
physical environment, the user environment and the computing environment. In this 
case, the user environment includes the user’s location and is treated separately from 
the physical environment. Lieberman and Selker treat the computing environment as 
a separate entity here because they believe that information such as network avail- 
ability can be of interest to the user and related computing devices. Hull et al [9], 
Lucas [13] and Chalmers and Sloman [4] argue that characteristics of the device it- 
self, such as screen size and input device, are also of interest to the user and system. 
They have therefore defined device characteristics as one element of their context 
classification. Chalmers and Sloman have also added user activity into their context 
classification. However, they have ignored time and other user characteristics, which 
may be important elements of context. 

Dey [6] has provided a top-level classification system, which includes four types of
context: location, identity, time and activity. Dey claims that these are primary types
of context that can be used to refer to other, secondary context. However, with this
classification, there is no clear separation between device and user. The computing
device and user should be treated differently because they have different features and
they affect user behaviour differently. Moreover, the primary context could lead to
complications in the computing process because the system has to spend time
translating the primary context into secondary context before it can be used in the
context aware system.

The classification systems mentioned above are intended to be context models defin- 
ing what elements of context should be used to reason about the user in order to have 
a better understanding about the user’s intentions. Chen and Kotz [5] have introduced 
a classification system with a completely different aim where context is classified 
depending on how it is being used in the application. They have classified context 
very broadly into two types: active and passive where active context is that which 
influences the behaviours of an application, and passive context is that which is rele- 
vant but not critical to an application. 



3 Analysis of the Problem 

Table 1 shows that there is a multitude of context classification systems, all of which 
are partial, covering both similar and different elements. Therefore the first problem 
in the context awareness field is that it lacks a single model of context for designers to 
refer to so that they have the same understanding of context and understand what key 
elements should be taken into account in order to have a better understanding of us- 
ers’ behaviour. 

Another problem is that in the implementation process, context aware applications 
have utilized only isolated subsets of their context, such as a location or a device’s 
state, e.g. [1]. There has been little research exploring the relationships between
different elements of context and how these relationships can affect the efficiency of
context aware applications. These relationships are important in order to use context 
to represent the world of the user and to help the system better to understand the 
user’s activities and intentions, acknowledging that humans assimilate multiple items 
of information to perform everyday tasks. 



4 Activity Theory 

4.1 Why Activity Theory? 



Our main goal in building a context classification system is to use it to
build a conceptual model of a user’s activity, state and intentions. There are many
approaches to analyzing and understanding human activity or tasks, such as Activity 
Theory [8] and Task Analysis [7, 10]. For the purpose of classifying context and 
attempting to relate existing partial classifications of context, we have developed an 
approach based on Activity Theory because it has the main characteristics described 
below.

It Provides a Standard Form for Describing Human Activity. Humans cannot 
fully understand the full moment-to-moment richness of other humans’ activities, 
states, goals and intentions. Yet they manage successfully and fluently to interact in 
many highly contextualised ways. Hence, in attempting to produce better context- 
aware systems, it is neither possible nor necessary to model all the richness of human 
activity. To make progress from the current state of the art, we propose that a suffi- 
ciently comprehensive context classification may be developed using the relatively 
simple standard form offered by Activity Theory that covers the key elements that 
have an influence on human activity. 

It Relates Individual Human Activity to Society. In an ambient computing world, 
users are not isolated workers at a desktop, in an office. Users are using the 
computing services within society and that society will have an influence on the 
user’s activities. Therefore, the context classification should allow the system to take 
account of what can have an impact on human activity within society. Activity 
Theory explicitly takes society into account in its modelling. 

It Provides a Concept of Tool Mediation. Ambient computing users may use sev- 
eral devices or computing services at any time or place. Therefore their tools and the 
environment are changing all the time. Characteristics of tools have an influence on 
users’ activity. Activity Theory includes this in the model. 

It Maps the Relationships Amongst the Elements of a Human Activity Model. 

Activity Theory also maps the relationships amongst each element that it identifies as 
having an influence on human activity. This provides us with a potentially useful way 
to classify context and may be used to model the relationships between each element 
of context. 



4.2 Background 

Activity Theory was developed by Russian psychologists Vygotsky, Rubinshtein,
Leont'ev and others beginning in the 1920s [11]. Activity Theory is a philosophical
framework used to conceptualize human activities. In 1987 Engeström [8] proposed a
triangular structure of human activity as shown in Figure 1.

The main concepts of this model are: 

Subject: Information about the individual or subgroup chosen as the point of view in 
the analysis.

Tools: Information about tools, which can mean either psychological or physical
tools.

Community: Information about individuals or subgroups who share the same object.

Division of labour: The division of tasks between members of the community.

Rules: Explicit or implicit regulations, norms and conventions that constrain action or
interaction.

Object: Target of the activity within the system: subject’s intention or objective.

Outcome: The result when the object is met.




Fig. 1. The full structure of Activity Theory introduced by Engeström



Activity Theory describes and relates key elements that influence human activity. 
However, applying Activity Theory to provide a context classification model that 
covers all possible contexts in an ambient computing world is not a simple process. 
Further work is needed to develop a context classification model that can be used as a 
framework to interpret the context of user behaviour in a context aware system. 



4.3 Example Scenario 

In moving from Activity Theory to a context classification model, we have adopted a 
scenario-based approach [3]. Scenarios are used to help identify key elements and 
how they influence user activity. By way of illustration here, we provide a brief ac- 
count of an example scenario. 

Henry is a new PhD student. He is assigned to teach once a week on Tuesday 9.15- 
10.15am. On Tuesday at 9.15, he arrives at the teaching room and discovers that he 
has forgotten the lecture notes. Thus, he has to try to remember what his notes con- 
tained and reproduce them. In the end, he gives up and spends 15 minutes fetching 
the lecture notes from his office. 

Figure 2 shows how the context in this scenario is drawn out by the Activity The- 
ory model. The context-aware system presents a selection of files on his PDA based 
on his current location, time, people around him, his role, rules and tool availability. 
Once he has selected a file, the system presents the contents on a projector for the 
students to see and allows Henry to control it via his PDA.

Fig. 2. Classifying the context in the scenario by using Activity Theory



5 From Activity Theory to Context Model 

The example scenario above briefly illustrates that Activity Theory has the potential 
to be used as a foundation for producing a (sufficiently) comprehensive context clas- 
sification system. The elements in Activity Theory cover the key elements of context 
in the scenario that have an influence on human activity. Moreover, the relationship 
between each element is also identified. However, this model lacks a representation 
of history, which can be represented through time. 

Time is a crucially important part of context. This includes not just current time, but 
also past time (that contributes a history element to the context) and future time (that 
allows for prediction of users’ actions from the current context). Hence, we must 
account for time in our context models. We propose the context model illustrated in 
Figure 3. The elements in the model can be described as follows: 

User: Information about the user and her physical environment that has influence on 
her activity, including user’s current location, action, device and timetable. 

Tools and their availability: Tools that are available in the public space and their
availability, including device characteristics, public services and computing environ- 
ment such as network availability. 

Rules: Norms, social rules and legislation within which the user relates to others in 
her community. 

Community: Information about people around the user (in both physical and virtual 
environments) that may have an influence on her activity. 

Division of Labour: Roles of user in that situation including who can perform which 
tasks on the object.

Object: User’s intention and objective. The system uses all the elements above to
decide about the user’s intention or objective. 

Time: For our current purposes, time is the occurrence of events in the past, the pres- 
ent and the future. 
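
As a rough illustration of how the elements listed above might be held together in software, the following sketch stores one observation of the model as a record. It is not part of the proposed model: the class names `ContextSnapshot` and `ContextHistory`, the field types, and all example values for the Henry scenario are assumptions made here for illustration only.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional


@dataclass
class ContextSnapshot:
    """One observation of the elements of the context model (hypothetical layout)."""
    user: dict                 # e.g. location, action, device, timetable
    tools: dict                # tool name -> currently available?
    rules: list                # norms, social rules, legislation
    community: list            # people around the user (physical or virtual)
    division_of_labour: dict   # person -> role in this situation
    time: datetime             # when the snapshot was taken
    obj: Optional[str] = None  # user's intention; None until the system decides


@dataclass
class ContextHistory:
    """Past snapshots, giving the history element that time contributes."""
    snapshots: list = field(default_factory=list)

    def add(self, snapshot):
        self.snapshots.append(snapshot)


# Example values are invented for the scenario; they are not taken from Figure 2.
henry = ContextSnapshot(
    user={"name": "Henry", "location": "teaching room", "device": "PDA"},
    tools={"projector": True, "network": True},
    rules=["lecture runs Tuesday 9.15-10.15"],
    community=["students"],
    division_of_labour={"Henry": "lecturer", "students": "audience"},
    time=datetime(2004, 11, 9, 9, 15),  # an arbitrary Tuesday at 9.15
)
history = ContextHistory()
history.add(henry)
```

Keeping the object as an optional field reflects the statement that the system decides about it from the other elements; how that decision is made is deliberately left open here.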




Fig. 3. Context model adapted from Activity Theory 



This is a first attempt at modelling a comprehensive context classification based on 
Activity Theory. A cycle of application, evaluation and iteration is required to ensure 
that the classification covers key elements in context awareness and identifies rele- 
vant relationships. Our next step is to generate more scenarios to produce a more 
comprehensive context classification model and then to evaluate this classification 
model. This model will then be used as part of the framework for implementing a 
context aware system. The system will then be tested with real users and evaluated to 
see if it reduces the user’s explicit input and provides the user with a usable context- 
aware system. 



6 Conclusion 

In this research, we aim to provide a comprehensive context classification system that 
includes the key elements of context that have an influence on a user’s diverse activi- 
ties in an ambient computing world. We also hope to identify the relationships be- 
tween each element in the classification so that these relationships may be applied 
during the development of a context aware system. This model can then be used as a
framework in the design process as the model will provide a better understanding of 
context.



Abstract. Pervasive or ubiquitous systems use context information
to adapt appliance behavior to human needs. Even more convenience is
reached if the appliance foresees the user’s desires. By means of context
prediction, systems get ready for future human activities and can act
proactively.

Predictions, however, are never 100% correct. In case of unreliable prediction
results it is sometimes better to make no prediction instead of a
wrong prediction. In this paper we propose three confidence estimation
methods and apply them to our State Predictor Method. The confidence
of a prediction is computed dynamically and predictions are only
made if the confidence exceeds a given barrier. Our evaluations are based
on the Augsburg Indoor Location Tracking Benchmarks and show that
the prediction accuracy with confidence estimation may rise by a factor
of 1.95 over the prediction method without confidence estimation. With
confidence estimation a prediction accuracy of up to 90% is reached.



1 Introduction 

Pervasive or ubiquitous systems aim at more convenience for daily activities
by relieving humans from monotonous routines. Here context prediction plays a
decisive role. Because of the sometimes unreliable results of predictions it is better
to make no prediction instead of a wrong prediction. Humans may be frustrated 
by too many wrong predictions and won’t believe in further predictions even 
when the prediction accuracy improves over time. Therefore confidence estima- 
tion of context prediction methods is necessary. This paper proposes confidence 
estimation of the State Predictor Method [7,8,9,10], which is used for next lo- 
cation prediction. Three confidence estimation methods were developed for the 
State Predictor Method. 

The proposed confidence estimation techniques can also be transferred to 
other prediction methods like Markov, Neural Network, or Bayesian Predictors. 
In the next section the State Predictor Method is introduced followed by sec- 
tion 3 that defines three methods of confidence estimation. Section 4 gives the 
evaluation results, section 5 outlines the related work, and the paper ends with a
conclusion.



P. Markopoulos et al. (Eds.): EUSAI 2004, LNCS 3295, pp. 375–386, 2004.
© Springer-Verlag Berlin Heidelberg 2004

2 The State Predictor Method

The State Predictor Method for next context prediction is motivated by branch 
prediction techniques of current high performance microprocessors. We devel- 
oped various types of predictors, 1-state vs. 2-state, 1-level vs. 2-level, and global 
vs. local [7]. We describe two examples. 

First, the 1-level 2-state predictor, or for short the 2-state predictor, is based
on the 2-bit branch predictor. Analogously to the branch predictor the 2-state
predictor holds two states for every possible next context (a weak and a strong
state).

Figure 1 shows the state diagram of a 2-state predictor for a person who is
currently in context X with the three possible next contexts A, B, and C. If the
person is in context X for the first time and changes e.g. to context C, the predictor
sets C0 as initial state. Next time the person is in context X, context C will
be predicted as next context. If the prediction proves correct the predictor
switches into the strong state C1, predicting the context C as next context again.
As long as the prediction is correct the predictor stays in the strong state C1.
If the person changes from context X to another context other than C (e.g. A),
the predictor switches into the weak state C0, still predicting context C. If the
predictor is in a weak state (e.g. C0) and misses again, then the predictor is
set to the weak state of the new context, which will be predicted next time. The
2-state predictor can be dynamically extended by new following contexts during
run-time.
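
The state transitions described above can be sketched in a few lines. This is an illustrative reading of the description, not the authors' implementation; the class name `TwoStatePredictor` and its methods are assumptions introduced here.

```python
class TwoStatePredictor:
    """2-state predictor for one current context (illustrative sketch)."""

    def __init__(self):
        self.context = None   # context that will be predicted next
        self.strong = False   # False: weak state (e.g. C0), True: strong state (e.g. C1)

    def predict(self):
        """Return the predicted next context, or None before the first update."""
        return self.context

    def update(self, actual):
        """Adapt the state once the actual next context is known."""
        if self.context is None:
            # First visit: start in the weak state of the observed context.
            self.context, self.strong = actual, False
        elif actual == self.context:
            # Correct prediction: move to (or stay in) the strong state.
            self.strong = True
        elif self.strong:
            # Miss in the strong state: fall back to the weak state, same context.
            self.strong = False
        else:
            # Miss in the weak state: switch to the weak state of the new context.
            self.context, self.strong = actual, False


p = TwoStatePredictor()
for next_context in ["C", "C", "A", "A"]:
    p.update(next_context)
print(p.predict(), p.strong)  # prints: A False
```

In the usage example the predictor reaches the strong state of C after two transitions to C, survives one change to A while still predicting C, and only switches to the weak state of A after the second miss.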



Fig. 1. 2-state predictor with three contexts

The second example is the global 2-level 2-state predictor which consists of 
a shift register and a pattern history table (see figure 2). A shift register of 
length n stores the pattern of the last n occurred contexts. By this pattern the 
shift register selects a 2-state predictor in the pattern history table which holds 
for every pattern a 2-state predictor. These 2-state predictors are used for the 
prediction. The length of the shift register is called the order of the predictor. 
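
A minimal sketch of this structure follows, assuming a dictionary as pattern history table and a bounded queue as shift register. The names `GlobalTwoLevelPredictor`, `update_two_state` and `observe` are hypothetical, and the 2-state logic is condensed into a small helper function.

```python
from collections import deque


def update_two_state(state, actual):
    """Return the new (context, strong) state of a 2-state predictor."""
    if state is None:
        return (actual, False)       # initialise in the weak state
    context, strong = state
    if actual == context:
        return (context, True)       # correct: go to the strong state
    if strong:
        return (context, False)      # miss in the strong state: go to the weak state
    return (actual, False)           # miss in the weak state: weak state of new context


class GlobalTwoLevelPredictor:
    def __init__(self, order):
        self.history = deque(maxlen=order)   # shift register of the last n contexts
        self.table = {}                      # pattern history table

    def predict(self):
        """Predict the next context, or None if the pattern is unknown."""
        if len(self.history) < self.history.maxlen:
            return None
        state = self.table.get(tuple(self.history))
        return state[0] if state else None

    def observe(self, context):
        """Train on the context that actually occurred, then shift it in."""
        if len(self.history) == self.history.maxlen:
            pattern = tuple(self.history)
            self.table[pattern] = update_two_state(self.table.get(pattern), context)
        self.history.append(context)


predictor = GlobalTwoLevelPredictor(order=2)
for room in ["A", "B", "C", "A", "B"]:
    predictor.observe(room)
print(predictor.predict())  # prints: C
```

After the sequence A, B, C, A, B the shift register holds the pattern (A, B), which was followed by C once before, so C is predicted from the weak state.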
Fig. 2. Global 2-level 2-state predictor

The 2-level predictors can be extended using Prediction by Partial Matching
(PPM) [10] or Simple Prediction by Partial Matching (SPPM). PPM
means that a maximum order m is applied in the first level instead of the fixed order.
Then, starting with this maximum order m, a pattern is searched according
to the last m contexts. If no pattern of the length m is found, the pattern of
the length m - 1 is looked for, i.e. the last m - 1 contexts. This process can be
continued until the order 1 is reached. SPPM means that if the predictor with
the highest order can’t deliver a result, the predictor with order 1 is requested
to do the prediction.
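
The two fallback strategies can be contrasted in a short sketch. To keep it small, each order's pattern history table is simplified to a plain mapping from pattern to predicted context (the 2-state logic is omitted); the function names `ppm_predict` and `sppm_predict` and the example tables are assumptions made for illustration.

```python
def ppm_predict(tables, history, max_order):
    """PPM: try orders m, m-1, ..., 1 and return the first available prediction.

    tables[k] maps a tuple of the last k contexts to a predicted next context.
    Returns (prediction, order used), or (None, 0) if no pattern matches.
    """
    for order in range(min(max_order, len(history)), 0, -1):
        pattern = tuple(history[-order:])
        prediction = tables.get(order, {}).get(pattern)
        if prediction is not None:
            return prediction, order
    return None, 0


def sppm_predict(tables, history, max_order):
    """SPPM: try only the maximum order, then fall back directly to order 1."""
    for order in (max_order, 1):
        if len(history) >= order:
            prediction = tables.get(order, {}).get(tuple(history[-order:]))
            if prediction is not None:
                return prediction, order
    return None, 0


# Hypothetical tables: no order-3 pattern is known yet.
tables = {
    1: {("B",): "C"},
    2: {("A", "B"): "D"},
    3: {},
}
print(ppm_predict(tables, ["X", "A", "B"], max_order=3))   # prints: ('D', 2)
print(sppm_predict(tables, ["X", "A", "B"], max_order=3))  # prints: ('C', 1)
```

With the same history, PPM stops at the order-2 pattern while SPPM skips it and answers from order 1, which is why the two variants can return different predictions.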

3 Confidence Estimation Methods 

3.1 Strong State 

The Strong State confidence estimation method is an extension of the 2-state
predictor and can be applied to the 2-state predictor itself and to the 2-level predictor
with 2-state predictors in the second level.



Fig. 3. Strong State - Prediction only in the strong states

The idea of the Strong State method is simply that the predictor supplies the
prediction result only if the predictor is in a strong state. The difference between
the weak state (no prediction provided because of low confidence) and the strong state
(the context transition has occurred at least two consecutive times) acts as a
barrier for a good prediction. Figure 3 shows the 2-state predictor for the three
contexts A, B, and C. The prediction result of the appropriate context is
supplied only in the strong states, which are double framed in the figure.

As a generalization this method can be extended to the n-state predictor
that uses n (n > 2) states instead of the two states of the 2-state predictor.
Therefore a barrier $k$ with $1 \le k < n$ must be chosen to separate the weak and
the strong states. For the 2-state predictor ($n = 2$) it is $k = 1$. Let $1 \le s \le n$ be
the value of the current state of the n-state predictor, then we can define:

$$s > k:\ \text{supply prediction result} \tag{1}$$

$$s \le k:\ \text{detain prediction result} \tag{2}$$

No additional memory costs arise from the Strong State method. 
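
The decision rule is small enough to state as a single function. The sketch assumes that states are numbered 1 to n, so that for the 2-state predictor the weak state has value 1 and the strong state has value 2; the function name `strong_state_filter` is introduced here for illustration.

```python
def strong_state_filter(prediction, s, k):
    """Supply the prediction only if the state value s lies above the barrier k."""
    return prediction if s > k else None


# 2-state predictor (n = 2, k = 1): only the strong state (s = 2) supplies a result.
print(strong_state_filter("C", s=2, k=1))  # prints: C
print(strong_state_filter("C", s=1, k=1))  # prints: None
```

Because the rule only reads the state the predictor already keeps, it needs no extra storage, which matches the remark on memory costs above.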



3.2 Threshold 



The Threshold confidence estimation method is independent of the prediction
algorithm used. This method compares the accuracy of the previous predictions
with a given threshold. If the prediction accuracy is greater than the threshold
the prediction is assumed confident and the prediction result will be supplied. For
the 2-level predictors we consider each 2-state predictor on its own. That means
for every pattern we investigate separately the prediction accuracy of the appropriate
2-state predictor. The prediction accuracy of a 2-state predictor is calculated
from the numbers of correct and incorrect predictions of this predictor; more
precisely, the prediction accuracy is the ratio of the number of correct predictions to
the number of all predictions. Let $c$ be the number of correct predictions, $i$ the
number of incorrect predictions, and $\alpha$ the threshold, then the method can be
described as follows:

$$\frac{c}{c+i} > \alpha:\ \text{supply prediction result} \tag{3}$$

$$\frac{c}{c+i} \le \alpha:\ \text{detain prediction result} \tag{4}$$



The threshold is a value between 0 (only mispredictions) and 1 (100% correct
predictions in the past) and should be chosen well above 0.5 (50% correct
predictions).

For the global 2-level predictors the pattern history table must be extended 
by two values, the number of correct predictions c and the number of incorrect 
predictions i (see figure 4). 

Fig. 4. Global 2-level 2-state predictor with threshold

The global 2-level 2-state predictor starts with an empty pattern history table.
After the first occurrence of a pattern the predictor can’t deliver a prediction
result. If the context c follows this pattern the 2-state predictor will be initialized
with the weak state of the context c. Furthermore 0 will be set as initial value
for the number of correct predictions and the number of incorrect
predictions. After the second occurrence of this pattern a first prediction is made
but with unknown reliability. The system cannot estimate the confidence, since
there are no prediction results yet: the prediction accuracy cannot be calculated
by the formula for estimating the confidence, because the denominator is equal
to 0. If the prediction proves correct the number of correct predictions will be
increased to 1. Otherwise the number of incorrect predictions will be incremented.
Furthermore the 2-state predictor will be adapted accordingly. After the next
occurrence of the pattern the prediction accuracy can be calculated by the formula
and the result can be compared with the threshold.
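
The bookkeeping per pattern can be sketched as follows. The class name `ThresholdConfidence` and the parameter name `alpha` for the threshold are chosen here; treating the undefined case (no predictions yet, denominator 0) as "not confident" is an assumption of this sketch, since the text only says that the confidence cannot be estimated at that point.

```python
class ThresholdConfidence:
    """Per-pattern counters for the Threshold method (illustrative sketch)."""

    def __init__(self, alpha):
        self.alpha = alpha     # threshold between 0 and 1
        self.correct = 0       # number of correct predictions so far
        self.incorrect = 0     # number of incorrect predictions so far

    def record(self, was_correct):
        """Update the counters after a prediction has been verified."""
        if was_correct:
            self.correct += 1
        else:
            self.incorrect += 1

    def is_confident(self):
        """Compare the accuracy of the previous predictions with the threshold."""
        total = self.correct + self.incorrect
        if total == 0:
            return False       # assumption: unknown reliability counts as unconfident
        return self.correct / total > self.alpha


conf = ThresholdConfidence(alpha=0.7)
for outcome in [True, True, True, False]:
    conf.record(outcome)
print(conf.is_confident())  # prints: True (accuracy 0.75 exceeds 0.7)
```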

A disadvantage of the Threshold method is “getting stuck” above the
threshold. That means if the previous predictions were all correct the prediction
accuracy is much greater than the threshold. If a behavior change to another
context now occurs and incorrect predictions follow, it takes a long time until the
prediction accuracy falls below the threshold. In the meantime the unreliable
predictor is still considered confident. A remedy is the method with confidence counter.
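
A small calculation makes the effect concrete. The helper below counts how many consecutive mispredictions are needed before the accuracy no longer exceeds the threshold; the function name and the example numbers are assumptions for illustration, not measured values.

```python
def mispredictions_until_detained(correct, alpha):
    """Count consecutive mispredictions until correct / (correct + i) <= alpha."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must lie strictly between 0 and 1")
    if correct == 0:
        return 0
    incorrect = 0
    while correct / (correct + incorrect) > alpha:
        incorrect += 1
    return incorrect


print(mispredictions_until_detained(correct=20, alpha=0.7))   # prints: 9
print(mispredictions_until_detained(correct=100, alpha=0.7))  # prints: 43
```

With a threshold of 0.7, a predictor that was right 20 times keeps supplying results through 8 wrong predictions in a row and is detained only after the 9th, and the delay grows in proportion to the length of the correct history.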

3.3 Confidence Counter 

The Confidence Counter method is independent of the used prediction algorithm, 
too. This method estimates the prediction accuracy with a saturation counter. 
The counter consists of n + 1 states (see figure 5). 



Fig. 5. Confidence Counter - state graph



The initial state can be chosen freely. Let s be the current state of the
confidence counter. If a prediction result proves correct (c) the counter will
increase, that means the state graph changes from state s into the state s + 1.
If s = n the counter keeps the state s. Otherwise, if the prediction is incorrect
(i), the counter switches into the state s - 1. If s = 0 the counter keeps the state
s. Furthermore there is a state $k$ with $0 \le k < n + 1$, which acts as a barrier
value: if s is greater than or equal to k the predictor is assumed to be “confident”,
otherwise the predictor is unconfident and the prediction result will not be supplied.
The method can be described as follows:

$$s \ge k:\ \text{supply prediction result} \tag{5}$$

$$s < k:\ \text{detain prediction result} \tag{6}$$
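
The saturation counter described above can be sketched as a small class. The class and method names are assumptions introduced here; the example values (n = 3, initial state 2, i.e. binary 10) follow the configuration used later in the evaluation, with the barrier k = 3 (binary 11) picked as one of the evaluated settings.

```python
class ConfidenceCounter:
    """Saturation counter with states 0..n and barrier k (illustrative sketch)."""

    def __init__(self, n, k, initial):
        if not (0 <= k <= n and 0 <= initial <= n):
            raise ValueError("k and initial must lie between 0 and n")
        self.n = n
        self.k = k
        self.state = initial

    def record(self, was_correct):
        """Count up on a correct prediction and down on an incorrect one."""
        if was_correct:
            self.state = min(self.state + 1, self.n)   # saturate at n
        else:
            self.state = max(self.state - 1, 0)        # saturate at 0

    def is_confident(self):
        """Supply the prediction result only if the state has reached the barrier."""
        return self.state >= self.k


cc = ConfidenceCounter(n=3, k=3, initial=2)
cc.record(True)                       # state 2 -> 3
confident_after_hit = cc.is_confident()
cc.record(False)                      # state 3 -> 2
confident_after_miss = cc.is_confident()
print(confident_after_hit, confident_after_miss)  # prints: True False
```

Unlike the accuracy ratio of the Threshold method, the counter forgets a long correct history after at most n mispredictions, which is what lets it react quickly to a behavior change.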
The prediction accuracy will be considered separately for every 2-state predictor.
The pattern history table must be extended by one column which stores
the current state of the confidence counter cc (see figure 6). The memory costs lie
between those of the Strong State and the Threshold methods.



Fig. 6. Global 2-level 2-state predictor with confidence counter

In the following the initial phase for the example of the global 2-level 2-state 
predictor is explained. Initially the pattern history table is empty. After the 
first occurrence of a pattern the algorithm cannot deliver a prediction result. 
If the context c follows, the 2-state predictor of this pattern will be initialized. 
Furthermore the confidence counter of this pattern sets a value between 0 and 
n as initial value, for example the value k. After the second occurrence of the 
pattern the confidence of the 2-state predictor can be estimated by the initial 
state of the confidence counter. If k was chosen as the initial state the prediction 
result will be supplied, since k classifies the predictor as confident. Otherwise, if 
the current state of the confidence counter is less than k, the prediction result 
will be detained. 



4 Evaluation 

For our application, the Smart Doorplates [12], we choose next location prediction
instead of general context prediction. The Smart Doorplates direct visitors
to the current location of an office owner based on a location tracking system
and predict the next location of the office owner if absent.

The evaluation was performed with the Augsburg Indoor Location Tracking 
Benchmarks [6]. These benchmarks consist of movement sequences of four test 
persons (two of these are selected in the following) in a university office building 
reported separately during the summer term and the fall term 2003. In the 
evaluation we don’t consider the corridor, because a person can leave a room 
only to the corridor. Furthermore we investigate only the global 2-level 2-state 
predictors with order 1 to 5, the PPM predictor, and the SPPM predictor. All 
considered predictors were evaluated for every test person with the corresponding 
summer benchmarks followed by the fall benchmarks. The abbreviations in the 
figures and tables stand for: G - global, 2L - 2-level, 2S - 2-state. The order and 
the maximum order respectively is given in parentheses. 

The training phase of the State Predictor Method depends on the order of the 
predictor. As some patterns might never occur for predictors with high order, 
the training phase of the 2-state predictors in the second level never starts.
Therefore we consider the training for every pattern separately. The training
phase of the 2-state predictor for a pattern includes only the first occurrence of 
the pattern. After the second occurrence of the pattern the training phase of the 
2-state predictor ends. 

Next we introduce some definitions which will be needed for explanations: 



- demand: number of predictions which are requested by the user. In
  our evaluation this number corresponds to the number of rooms a test
  person has entered during the measurements.

- supply: number of prediction results which are delivered by the system.
  This corresponds to the number of confident predictions, which can
  be calculated from the requested predictions minus the predictions of the
  training phase of every 2-state predictor and the unconfident predictions.

- quality: ratio of the number of correct predictions to the supply.

  $$\text{quality} = \frac{\#\,\text{correct predictions}}{\text{supply}}$$

- quantity: ratio of supply to demand.

  $$\text{quantity} = \frac{\text{supply}}{\text{demand}}$$

- gain: the factor which gives the improvement of the quality with confidence
  estimation over the quality without confidence estimation.

  $$\text{gain} = \frac{\text{quality with confidence estimation}}{\text{quality without confidence estimation}}$$
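
The three derived measures can be computed directly from the counts defined above. The function name `evaluate` and the numbers in the usage example are invented for illustration; they are not taken from the benchmarks.

```python
import math


def evaluate(demand, supply, correct, baseline_quality):
    """Return (quality, quantity, gain) from the counts defined above.

    demand:  number of requested predictions
    supply:  number of prediction results actually delivered
    correct: number of delivered predictions that were correct
    baseline_quality: quality of the same predictor without confidence estimation
    """
    quality = correct / supply
    quantity = supply / demand
    gain = quality / baseline_quality
    return quality, quantity, gain


# Made-up counts: 200 requests, 60 delivered results, 54 of them correct.
quality, quantity, gain = evaluate(demand=200, supply=60, correct=54,
                                   baseline_quality=0.45)
print(round(quality, 2), round(quantity, 2), round(gain, 2))  # prints: 0.9 0.3 2.0
```

The example shows the trade-off the measures are meant to expose: the quality doubles relative to the baseline, but only 30% of the requests receive a prediction at all.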



4.1 Strong State 

Figures 7 and 8 and tables 1 and 2 show the results of the measurements with 
the Strong State method. The figures show the quality and the quantity of the 
prediction method with and without confidence estimation. The graphs with 
confidence estimation are denoted by SS in brackets. Quality, quantity, and gain 
were measured for the two test persons. 



Table 1. Gain - Person A

G-2L-2S   (1)    (2)    (3)    (4)    (5)    -PPM(5)   -SPPM(5)
Gain      1.46   1.48   1.47   1.50   1.39   1.64      1.53

Table 2. Gain - Person D

G-2L-2S   (1)    (2)    (3)    (4)    (5)    -PPM(5)   -SPPM(5)
Gain      1.74   1.53   1.39   1.42   1.28   1.40      1.62


The gain is always greater than 1, that means the quality improves with
confidence estimation for all predictors. The greatest gain (1.74) is reached by
the global 2-level 2-state predictor with order 1 of test person D (see figure 8 and
table 2). The reason is that the predictor very often changes between various
weak states.

Fig. 7. Quality and Quantity - Person A

Fig. 8. Quality and Quantity - Person D

If we consider the quantity we can see that the value falls for all measurements.
In the case of the global 2-level 2-state predictor with order 5 the quantity
falls even below 10%. That means the algorithm delivers a prediction result
in fewer than one out of ten requests.



4.2 Threshold 

Figures 9 to 14 show the measurement results of the Threshold method. For 
the two test persons the quality, the quantity, and the gain were measured. The 
threshold varies between 0.2 and 0.8. 




Fig. 9. Quality - Person A

Fig. 10. Quality - Person D

Normally the quality can be increased by increasing the threshold. The figures
of test person A, however, show that the quality decreases for thresholds greater
than or equal to 0.7. The reason is that the 2-state predictors reach a prediction
accuracy between 60 and 70 percent. The best quality with threshold was reached
by the predictor with order 1 for all test persons.

Here, too, the quantity decreases with increasing threshold. The best quantity
with a high threshold was reached by the global 2-level 2-state predictor
with Prediction by Partial Matching. The gain shows the same behavior as the
quality. If the threshold increases the gain increases too. The measurements of
test person A show that the gain decreases if the threshold is greater than or
equal to 0.7. The best gain of 1.9 was reached by the predictor G-2L-2S(1) with a
threshold value of 0.9 for test person D (see figure 14).

Fig. 13. Gain - Person A

Fig. 14. Gain - Person D



4.3 Confidence Counter 

Figures 15 to 20 show the results of the measurements of the Confidence Counter
method. For the evaluation a confidence counter with n = 3 was chosen, which can
be described by two bits (states 00, 01, 10, and 11).

We set the state 10 as the initial state. We performed the measurements
with k ∈ {00, 01, 10, 11}, where k = 00 means the predictor works like
the corresponding predictor without confidence estimation. Again the quality, the
quantity, and the gain were measured.

As expected the quality increases in all cases if k increases. Accordingly the
quantity decreases with a higher k. The gain behaves analogously to the quality.
The best quality of 90% (see figure 15) and the best gain of 1.95 (see figure 20)
were reached by the predictor G-2L-2S(1) with k = 11. For all test persons the best
quantity was reached by the predictors with prediction by partial matching followed by
the predictors with simple prediction by partial matching. The predictors with
order 5 reached the lowest quantity in all cases.

Fig. 15. Quality - Person A

Fig. 16. Quality - Person D

Fig. 17. Quantity - Person A

Fig. 18. Quantity - Person D

Fig. 19. Gain - Person A

Fig. 20. Gain - Person D



5 Related Work 

A number of context prediction methods are used in ubiquitous computing
[1,4,5,14,15]. None of them considers the confidence of the predictions.

Because the State Predictor Method was motivated by branch prediction
methods of current high performance microprocessors, we are influenced by
the confidence estimation methods in this research area. Smith [11] proposed
a confidence estimator based on the saturation counter method. This approach
corresponds to the Confidence Counter method. Grunwald et al. [2] investigated,
in addition to the given techniques, a method which counts the correct predictions
of every branch and compares this number with a threshold. If this number
is greater than the threshold the branch has a high confidence, otherwise a low
confidence. This method corresponds to the Threshold confidence estimation
method.

Jacobson et al. [3] use a table of shift registers as confidence estimator, the
so-called n-bit correct/incorrect registers, in addition to the base branch predictor.
For every correct prediction a 0 and for every incorrect prediction a 1 will be entered
into the register. A so-called reduction function classifies the confidence as high
or low. An example of such a function is the number of 1’s compared to a threshold.
If the number of 1’s is greater than the threshold, the confidence is classified as
low, otherwise as high. Furthermore Jacobson et al. propose a two-level method
which works with two of the introduced tables.
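
For comparison with the methods of this paper, a single correct/incorrect register with the ones-counting reduction function could look roughly like this. It is a sketch based only on the description above, not on the original publication: the class name, the register being initialised with zeros, and the example parameters are assumptions.

```python
from collections import deque


class CorrectIncorrectRegister:
    """n-bit shift register of prediction outcomes (0 = correct, 1 = incorrect)."""

    def __init__(self, bits, threshold):
        # Assumption: the register starts filled with zeros.
        self.register = deque([0] * bits, maxlen=bits)
        self.threshold = threshold

    def record(self, was_correct):
        """Shift in a 0 for a correct and a 1 for an incorrect prediction."""
        self.register.append(0 if was_correct else 1)

    def is_high_confidence(self):
        """Reduction function: low confidence if the number of 1's exceeds the threshold."""
        return sum(self.register) <= self.threshold


reg = CorrectIncorrectRegister(bits=4, threshold=1)
reg.record(False)
reg.record(False)
low = reg.is_high_confidence()    # two 1's in the register: low confidence
for _ in range(3):
    reg.record(True)
high = reg.is_high_confidence()   # one 1 left in the register: high confidence
print(low, high)  # prints: False True
```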

Tyson et al. [13] investigated the distribution of the incorrect predictions.
They observed that a small number of branches account for the main part of the
incorrect predictions. They proposed a confidence estimator which assigns a high
confidence to a determined number of branches, and a low confidence to the others.

The methods of Jacobson et al. and Tyson et al. may influence further con- 
fidence estimation methods in context prediction. 



6 Conclusion 



This paper introduced confidence estimation into the State Predictor Method. 
The motivation was that in many cases it is better to make no prediction in- 
stead of a wrong prediction to avoid frustrations by too many mispredictions. 
The confidence estimation methods may automatically suppress predictions in 
a first training state of the predictor where mispredictions may occur often. We 
proposed three methods for confidence estimation. 

For all three methods the prediction accuracy increases with confidence esti- 
mation. The best gain of 1.95 was reached for the Confidence Counter method. 
In this case the prediction accuracy without confidence estimation was 44.5%, 
and the prediction accuracy with confidence estimation reached 86.6%. But the 
quantity decreases with the estimation, that means the system delivers predic- 
tion results less often. The best result was reached again with the Confidence 
Counter method. Here a prediction accuracy of about 90% is reached. But in 
this case the quantity is about 33%. 
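
As a quick consistency check, the reported best gain follows from the two accuracies given above by the definition of gain from Section 4:

$$\text{gain} = \frac{\text{quality with confidence estimation}}{\text{quality without confidence estimation}} = \frac{86.6\%}{44.5\%} \approx 1.95$$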

We plan to combine confidence estimation with other prediction methods, in
particular Markov, Neural Network, and Bayesian Predictors.